Rapid subcloning using site-specific recombination

ABSTRACT

The present invention provides compositions, including vectors, and methods for the rapid subcloning of nucleic acid sequences in vivo and in vitro. In particular, the invention provides vectors used to contain a gene of interest that comprise a sequence-specific recombinase target site. These vectors are used to rapidly transfer the gene or genes of interest into any vector that contains a sequence-specific recombinase target site located downstream of a regulatory element so that the gene of interest may be regulated.

[0001] This is a Continuation-In-Part Application of pending application Ser. No. 08/864,224, filed Feb. 28, 1997.

FIELD OF THE INVENTION

[0002] The invention relates to recombinant DNA technology. In particular, the invention relates to compositions, including vectors, and methods for the rapid subcloning of nucleic acid sequences in vivo and in vitro.

BACKGROUND OF THE INVENTION

[0003] Molecular biotechnology has revolutionized the production of protein and polypeptide compounds of pharmacological importance. The advent of recombinant DNA technology permitted for the first time the production of proteins on a large scale in a recombinant host cell rather than by the laborious and expensive isolation of the protein from tissues which may only contain minute quantities of the desired protein (e.g., isolation of human growth hormone from cadaver pituitary). The production of proteins, including human proteins, on a large scale in a heterologous host requires the ability to express the protein of interest in the heterologous host. This process typically involves isolation or cloning of the gene encoding the protein of interest followed by transfer of the coding region into an expression vector that contains elements (e.g., promoters) which direct the expression of the desired protein in the heterologous host cell. The most commonly used means of transferring or subcloning a coding region into an expression vector involves the in vitro use of restriction endonucleases and DNA ligases. Restriction endonucleases are enzymes which generally recognize and cleave a specific DNA sequence in a double-stranded DNA molecule. Restriction enzymes are used to excise the coding region from the cloning vector and the excised DNA fragment is then joined using DNA ligase to a suitably cleaved expression vector in such a manner that a functional protein may be expressed.

[0004] The ability to transfer the desired coding region to an expression vector is often limited by the availability or suitability of restriction enzyme recognition sites. Often multiple restriction enzymes must be employed for the removal of the desired coding region and the reaction conditions used for each enzyme may differ such that it is necessary to perform the excision reactions in separate steps. In addition, it may be necessary to remove a particular enzyme used in an initial restriction enzyme reaction prior to completing all restriction enzyme digestions; this requires a time-consuming purification of the subcloning intermediate. Ideal methods for the subcloning of DNA molecules would permit the rapid transfer of the target DNA molecule from one vector to another in vitro or in vivo without the need to rely upon restriction enzyme digestions.

SUMMARY OF THE INVENTION

[0005] The present invention provides reagents and methods which comprise a system for the rapid subcloning of nucleic acid sequences in vivo and in vitro without the need to use restriction enzymes.

[0006] The present invention provides a method for the recombination of nucleic acid constructs, comprising: providing a first nucleic acid construct comprising, in operable order, an origin of replication, a first sequence-specific recombinase target site, and a nucleic acid of interest, a second nucleic acid construct comprising, in operable order, an origin of replication, a regulatory element and a second sequence-specific recombinase target site adjacent to and downstream from the regulatory element, and a site-specific recombinase; contacting the first and the second nucleic acid constructs with the site-specific recombinase under conditions such that the first and second nucleic acid constructs are recombined to form a third nucleic acid construct, wherein the nucleic acid of interest is operably linked to the regulatory element. The present invention contemplates the use of any type of regulatory element. In some embodiments of the present invention, the regulatory element comprises a promoter element, a fusion peptide (e.g., an affinity domain), or an epitope tag. In preferred embodiments, the nucleic acid of interest comprises a gene.
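The fusion described in the preceding paragraph can be pictured with a small sketch. It treats each circular construct as an ordered list of feature labels and joins the two circles at their single recombinase target site. The labels (`conditional_ori`, `kanR`, `ampR`, and so on) and the helper names `rotate_to` and `fuse` are illustrative assumptions, not terms from the specification, and the model ignores sequence, strand, and site orientation.

```python
from typing import List


def rotate_to(features: List[str], name: str) -> List[str]:
    """Rotate a circular feature list so that `name` comes first."""
    i = features.index(name)
    return features[i:] + features[:i]


def fuse(donor: List[str], host: List[str], site: str = "lox") -> List[str]:
    """Join two circles, each carrying exactly one `site`, into one circle.

    Both circles are opened at their site and concatenated, so the product
    carries two copies of the site and every feature of both parents.
    """
    for label, construct in (("donor", donor), ("host", host)):
        if construct.count(site) != 1:
            raise ValueError(f"{label} must carry exactly one {site} site")
    return rotate_to(donor, site) + rotate_to(host, site)


# Hypothetical first construct: site placed upstream of the nucleic acid of interest.
donor = ["conditional_ori", "lox", "gene_of_interest", "kanR"]
# Hypothetical second construct: site placed downstream of the regulatory element.
host = ["ori", "promoter", "lox", "ampR"]

# Rotate the circular product so that it is displayed starting at the promoter.
fused = rotate_to(fuse(donor, host), "promoter")
print(fused)
```

In this toy model the feature that followed the site in the first construct ends up immediately after the promoter-proximal site in the product, which is the arrangement the paragraph calls "operably linked to the regulatory element".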

[0007] In some embodiments, the first nucleic acid construct further comprises a selectable marker. In other embodiments, the second nucleic acid construct further comprises a selectable marker. The present invention contemplates that the first and second nucleic acid constructs both comprise selectable markers. In preferred embodiments the selectable markers of the first and second nucleic acid constructs are different from one another. Selectable markers include, but are not limited to, a kanamycin resistance gene, an ampicillin resistance gene, a tetracycline resistance gene, a chloramphenicol resistance gene, a streptomycin resistance gene, a spectinomycin resistance gene, the aadA gene, the ΦX174 E gene, the strA gene, and the sacB gene.

[0008] In preferred embodiments, the first nucleic acid construct further comprises a prokaryotic termination sequence. Prokaryotic termination sequences include, but are not limited to, the T7 termination sequence. In other preferred embodiments, the first nucleic acid construct further comprises a eukaryotic polyadenylation sequence. Polyadenylation sequences include, but are not limited to, the bovine growth hormone polyadenylation sequence, the simian virus 40 polyadenylation sequence, and the Herpes Simplex virus thymidine kinase polyadenylation sequence. In yet other preferred embodiments, the first nucleic acid construct further comprises a conditional origin of replication.

[0009] In preferred embodiments of the present invention, the first and second sequence-specific recombinase target sites are selected from the group consisting of loxP, loxP2, loxP3, loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att. The present invention contemplates that the first and second sequence-specific recombinase target sites may comprise the same sequence or may comprise different sequences.

[0010] In yet other embodiments of the present invention, the first nucleic acid construct further comprises a polylinker.

[0011] The present invention contemplates that the recombination methods can be used in vitro and in vivo. In some in vivo embodiments, the site-specific recombinase is provided by a host cell expressing the site-specific recombinase. In some in vivo methods, the contacting of the first and the second nucleic acid constructs with the site-specific recombinase comprises introducing the first and the second nucleic acid constructs into a host cell under conditions such that the third nucleic acid construct is capable of replicating in the host cell.

[0012] The present invention further provides methods for precise transfer of nucleic acid molecules by recombination. In some embodiments, the first nucleic acid construct further comprises a third sequence-specific recombinase target site and the second nucleic acid construct further comprises a fourth sequence-specific recombinase target site. In preferred embodiments, the first sequence-specific recombinase target site and the third sequence-specific recombinase target site in the first nucleic acid construct are located on opposite sides of the nucleic acid of interest. It is contemplated that the first and third sequence-specific recombinase target sites are contiguous with, adjacent to, or distant from the nucleic acid of interest. In particularly preferred embodiments the third and fourth sequence-specific recombinase target sites are selected from the group consisting of RS sites and Res sites, although other target sites are contemplated by the present invention. In some embodiments of this method of the present invention, the first nucleic acid construct further comprises a third sequence-specific recombinase target site and the second nucleic acid construct further comprises a fourth sequence-specific recombinase target site, wherein the method further comprises providing a second site-specific recombinase and the step of contacting the third nucleic acid construct with the second site-specific recombinase under conditions such that the third nucleic acid construct is recombined to form a fourth and a fifth nucleic acid construct.

[0013] The present invention also provides a recombined nucleic acid construct prepared according to any of the above methods.

[0014] The present invention further provides a method for the recombination of nucleic acid constructs, comprising: providing a vector, a linear nucleic acid molecule comprising a sequence complementary to at least a portion of said vector, and an E. coli host cell, wherein said host cell comprises an endogenous recombination system, a loss of function rec mutation, a suppressor, and a loss of function endogenous restriction modification system mutation; and introducing the vector and the linear nucleic acid molecule into the host cell under conditions such that the linear nucleic acid molecule and the vector are recombined to form a recombinant nucleic acid construct. In preferred embodiments the loss of function rec mutation is selected from the group consisting of recBC and recD. In other preferred embodiments, the suppressor comprises sbc. In yet other preferred embodiments, the loss of function endogenous restriction modification system mutation comprises hsdR.

[0015] The present invention further provides a method for generating a nucleic acid fusion on the 3′ end of the nucleic acid of interest in the first nucleic acid construct from above, comprising: providing a tagged linear nucleic acid sample comprising a tag to be added to the 3′ end of the nucleic acid of interest, and a sequence complementary to a region of the first nucleic acid construct that is 3′ of the nucleic acid of interest; and a host cell capable of endogenous homologous recombination of complementary nucleic acid molecules; and introducing the tagged linear nucleic acid sample and the first nucleic acid construct into the host cell under conditions such that the tagged linear nucleic acid sample and the first nucleic acid construct are recombined to form a tagged nucleic acid construct.

[0016] The present invention further provides a method for the cloning of nucleic acid libraries, comprising: providing a plurality of first nucleic acid constructs comprising, in operable order, an origin of replication, a first sequence-specific recombinase target site, and a nucleic acid member from a nucleic acid library, a plurality of second nucleic acid constructs comprising, in operable order, an origin of replication, a regulatory element and a second sequence-specific recombinase target site adjacent to and downstream from the regulatory element, and a site-specific recombinase; contacting the plurality of first and second nucleic acid constructs with the site-specific recombinase under conditions such that the plurality of first and second nucleic acid constructs are recombined to form a plurality of third nucleic acid constructs, wherein the nucleic acid members from the nucleic acid library are operably linked to the regulatory elements. The present invention further provides a nucleic acid library prepared according to the above method.

[0017] The present invention also provides a method for the directional cloning of a nucleic acid molecule, comprising: providing first and second portions of a regulatory element, a first nucleic acid molecule comprising the first portion of the regulatory element; and a second nucleic acid molecule comprising the second portion of the regulatory element; and combining the first and the second nucleic acid molecules to produce a third nucleic acid molecule under conditions whereby an intact regulatory element is produced from the combination of the first and the second portions of the regulatory element, wherein the presence of the intact regulatory element in the third nucleic acid molecule indicates a direction of cloning of the first nucleic acid molecule with respect to the second nucleic acid molecule.

[0018] The present invention also provides a method for the directional cloning of a nucleic acid molecule, comprising providing: the nucleic acid molecule to be cloned, a first primer comprising sequence complementary to the nucleic acid molecule, a second primer comprising sequence complementary to the nucleic acid molecule and sequence corresponding to a first portion of a lacO site, amplification means, and a target nucleic acid molecule comprising a second portion of the lacO site; amplifying the nucleic acid molecule with the first and second primers to produce a modified nucleic acid molecule comprising the first portion of a lacO site; and ligating the modified nucleic acid molecule into the target nucleic acid such that, when cloned in the desired direction, an intact lacO site is produced. In some embodiments, the method further comprises the step of detecting the intact lacO site. In particularly preferred embodiments, the target nucleic acid molecule comprises pUNI-30.
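The orientation logic of the split-site approach can be illustrated with a toy model. The sketch below uses a made-up 12-nucleotide stand-in for the site, split into two halves; it is not the lacO sequence, and the flanking vector sequences, the insert, and the choice of which end carries which half are all assumptions made for illustration. Only one insert orientation brings the two halves together.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_site(molecule: str, site: str) -> bool:
    """True if `site` occurs on either strand of `molecule`."""
    return site in molecule or reverse_complement(site) in molecule


# Hypothetical stand-in site, split into the two portions.
VECTOR_HALF = "AACCGG"   # portion carried by the target molecule
PRIMER_HALF = "TTGACA"   # portion added to the insert by the second primer
SITE = VECTOR_HALF + PRIMER_HALF

# Hypothetical target molecule, opened next to its half-site.
vector_left = "CATG" + VECTOR_HALF
vector_right = "GCGC"

# Hypothetical amplified insert carrying the primer-derived half at one end.
insert = PRIMER_HALF + "ATGAAACCC"

desired = vector_left + insert + vector_right
reversed_orientation = vector_left + reverse_complement(insert) + vector_right

print(has_site(desired, SITE))
print(has_site(reversed_orientation, SITE))
```

In the desired orientation the two halves abut and the full stand-in site is found; in the reversed orientation the primer-derived half lies at the far junction and no intact site is formed, which is what makes detection of the intact site a readout of cloning direction.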

[0019] The present invention further provides a method for regulated recombination in host cells that constitutively express a recombinase, comprising: providing a host cell expressing a recombinase, a first nucleic acid construct comprising an origin of replication, a first site-specific recombinase site, a second site-specific recombinase site that differs in sequence from the first site-specific recombinase site such that the recombinase will not initiate recombination between the first and second site-specific recombinase sites, and a selectable marker gene between the first and second site-specific recombinase sites, and a second nucleic acid construct comprising an origin of replication, a third site-specific recombinase target site, and a fourth site-specific recombinase target site that differs in sequence from the third site-specific recombinase site such that the recombinase will not initiate recombination between the third and fourth site-specific recombinase sites; and introducing the first and second nucleic acid constructs into the host cell under conditions such that the first and second nucleic acid constructs are recombined. In some embodiments, the method further comprises the step of selecting for a desired recombinant nucleic acid molecule using the selectable marker. In preferred embodiments, the first nucleic acid construct is a Univector. In alternative preferred embodiments, the second nucleic acid construct is a Univector.

[0020] The present invention also provides a nucleic acid construct comprising, in operable order: a conditional origin of replication; a sequence-specific recombinase target site having a 5′ and a 3′ end; and a unique restriction enzyme site, said restriction enzyme site located adjacent to the 3′ end of the sequence-specific recombinase target site. In some embodiments, the construct further comprises a prokaryotic termination sequence. In yet other embodiments, the construct further comprises a eukaryotic polyadenylation sequence. The present invention contemplates the use of any prokaryotic termination sequence and any eukaryotic polyadenylation sequence. In preferred embodiments, the construct further comprises one or more selectable marker genes. Selectable marker genes include, but are not limited to, the kanamycin resistance gene, the ampicillin resistance gene, the tetracycline resistance gene, the chloramphenicol resistance gene, the streptomycin resistance gene, the strA gene, and the sacB gene. In preferred embodiments, the sequence-specific recombinase target site is selected from the group consisting of loxP, loxP2, loxP3, loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att.

[0021] In some embodiments the construct further comprises a gene of interest inserted into the unique restriction enzyme site. In particular embodiments, the construct has the nucleotide sequence set forth in SEQ ID NO:1 (FIG. 26A). In other embodiments, the construct further comprises a second sequence-specific recombinase target site. In preferred embodiments, the second sequence-specific recombinase target site is selected from the group consisting of an RS site and a Res site. In yet other embodiments, the construct further comprises a polylinker.

[0022] The present invention further provides a nucleic acid construct comprising in 5′ to 3′ operable order: an origin of replication; a promoter element having a 5′ and a 3′ end; and a sequence-specific recombinase target site having a 5′ and a 3′ end. In some embodiments, the construct further comprises a selectable marker gene.

[0023] The present invention also provides a nucleic acid construct comprising in operable order: a promoter element having a 5′ and a 3′ end; a first sequence-specific recombinase target site having a 5′ and a 3′ end, wherein the 3′ end of the promoter element is located upstream of the 5′ end of the sequence-specific recombinase target site; a gene of interest joined to the 3′ end of the sequence-specific recombinase target site such that a functional translational reading frame is created; a conditional origin of replication; a first selectable marker gene; a second sequence-specific recombinase target site; and an origin of replication. In some embodiments, the construct further comprises a second selectable marker gene.

[0024] The present invention also provides a method for the recombination of nucleic acid constructs, comprising: providing a first nucleic acid construct comprising a loxH site, a second nucleic acid construct comprising a loxH site; and a site-specific recombinase; and contacting the first and the second nucleic acid constructs with the site-specific recombinase under conditions such that the first and second nucleic acid constructs are recombined. The present invention also provides a recombined nucleic acid construct prepared according to the above method.

DESCRIPTION OF THE DRAWINGS

[0025] FIG. 1 provides a schematic illustrating certain elements of the pUNI vectors and the Univector Fusion System.

[0026] FIG. 2A provides a schematic map of the pUNI-10 vector; the locations of selected restriction enzyme sites are indicated and unique sites are indicated by the use of bold type.

[0027] FIG. 2B shows the DNA sequence of the loxP site and the polylinkers contained within pUNI-10 (i.e., nucleotides 401-530 of SEQ ID NO:1).

[0028] FIG. 3A shows the oligonucleotides (SEQ ID NOS:4 and 5) which were annealed to insert a loxP site into the polylinker of pGEX-2TKcs to create pGst-lox.

[0029] FIG. 3B provides a schematic map of pGEX-2TKcs which includes an enlargement of the multiple cloning site (MCS).

[0030] FIG. 4A shows the oligonucleotides (SEQ ID NOS:6 and 7) which were annealed to insert a loxP site into the polylinker of pVL1392 to create pVL1392-lox.

[0031] FIG. 4B provides a schematic map of pVL1392 which includes an enlargement of the multiple cloning site (MCS); the ampicillin resistance gene (Ap^(R)) and the tac promoter (P_(tac)) are indicated.

[0032] FIG. 5A shows the oligonucleotides (SEQ ID NOS:8 and 9) which were annealed to insert a loxP site into the polylinker of pGAP24 to create pGAP24-lox.

[0033] FIG. 5B provides a schematic map of pGAP24 which includes an enlargement of the multiple cloning site (MCS); the ampicillin resistance gene (Ap^(R)), the GAP promoter (P_(GAP)), the origin from the 2 μm circle (2 μ) and the TRP1 gene, encoding N-(5′-phosphoribosyl)-anthranilate synthetase, (TRP1) are indicated.

[0034] FIG. 6A shows the oligonucleotides (SEQ ID NOS:8 and 9) which were annealed to insert a loxP site into the polylinker of pGAL14 to create pGAL14-lox.

[0035] FIG. 6B provides a schematic map of pGAL14 which includes an enlargement of the multiple cloning site (MCS); the ampicillin resistance gene (Ap^(R)), the GAL promoter (P_(GAL)), the yeast centromeric sequences (CEN), yeast autonomous replication sequences (ARS) and the TRP1 gene (TRP1) are indicated.

[0036] FIG. 7 shows a Coomassie blue-stained SDS-PAGE gel showing the purification of Gst-Cre from E. coli cells containing pQL123.

[0037] FIG. 8 provides a schematic showing the strategy employed for the in vitro recombination of a pUNI vector (“pA,” pUNI-5) with a pHOST vector (“pB,” pQL103) to create a fused construct (“pAB”). The relevant markers on each construct are indicated, as are selected restriction enzyme sites.

[0038] FIG. 9A provides a schematic showing the starting constructs (pUNI-Skp1 and pGst-lox) and the predicted fusion construct (pGst-Skp1) generated by an in vitro fusion reaction.

[0039] FIG. 9B provides an ethidium bromide-stained gel showing the separation of restriction fragments generated by the digestion of pUNI-Skp1, pGst-lox and pGst-Skp1.

[0040] FIG. 10A shows a Coomassie blue-stained SDS-PAGE gel showing the expression of the Gst-Skp1 protein from E. coli cells containing pGst-Skp1.

[0041] FIG. 10B shows a Western blot of an SDS-PAGE gel containing extracts prepared from E. coli cells containing pGst-Skp1 which was probed using an anti-Skp1 antibody.

[0042] FIG. 11 shows a Western blot of an SDS-PAGE gel containing extracts prepared from E. coli cells (QLB4) containing either a conventionally constructed Gst-Skp1 plasmid or pGst-Skp1 (produced by an in vitro fusion reaction).

[0043] FIG. 12 provides a schematic illustrating the in vivo gene trap method for the recombination of lox-containing vectors in a host cell constitutively expressing the Cre protein.

[0044] FIG. 13 provides the nucleotide sequence of the wild-type loxP site (SEQ ID NO:12), the loxP2 site (SEQ ID NO:13), the loxP3 site (SEQ ID NO:14) and the loxP23 site (SEQ ID NO:15).

[0045] FIG. 14 shows a schematic for one embodiment of Cre-mediated plasmid fusion.

[0046] FIG. 15 shows data demonstrating the efficiency of Gst-Cre recombinase activity as measured by UPS.

[0047] FIG. 16 shows the protein expression of UPS generated fusion proteins containing loxP following separation by SDS-PAGE and (A) staining with Coomassie blue, and (B) immunoblotting with anti-Skp1 antibodies.

[0048] FIG. 17 shows a comparison of expression levels between loxP and loxH containing constructs.

[0049] FIG. 18 shows the expression of UPS-derived baculovirus expression constructs in insect cells.

[0050] FIG. 19 shows immunoblotting with anti-HA antibodies of HeLa cells expressing Myc-tagged F-box protein under the control of the CMV promoter.

[0051] FIG. 20 shows a schematic representation of the POT reaction.

[0052] FIG. 21 shows restriction digestion assays of a sample that underwent POT with SKP1 replacing the E gene in pAS2-E.

[0053] FIG. 22 shows a schematic of a method for directional subcloning of nucleic acid samples into a Univector.

[0054] FIG. 23 provides a schematic map of the pUNI-10, pUNI-20, and pUNI-30 vectors.

[0055] FIG. 24 shows a schematic of a method for producing a tagged recombinant protein.

[0056] FIG. 25 shows a schematic of a gap repair scheme for modification of the 3′ end of coding regions using homologous recombination.

[0057] FIG. 26 shows the sequence for: A) SEQ ID NO:1; B) SEQ ID NO:10; and C) SEQ ID NO:11.

DEFINITIONS

[0058] To facilitate understanding of the invention, a number of terms are defined below.

[0059] As used herein, “a conditional origin of replication” refers to an origin of replication that requires the presence of a functional trans-acting factor (e.g., a replication factor) in a prokaryotic host cell. Conditional origins of replication include, but are not limited to, temperature-sensitive replicons such as rep pSC101^(ts).

[0060] As used herein, the term “origin of replication” refers to an origin of replication that is functional in a broad range of prokaryotic host cells (i.e., a normal or non-conditional origin of replication such as the ColE1 origin and its derivatives).

[0061] The terms “sequence-specific recombinase” and “site-specific recombinase” refer to enzymes that recognize and bind to a short nucleic acid site or sequence and catalyze the recombination of nucleic acid in relation to these sites.

[0062] The terms “sequence-specific recombinase target site” and “site-specific recombinase target site” refer to a short nucleic acid site or sequence which is recognized by a sequence- or site-specific recombinase and which becomes the crossover region during the site-specific recombination event. Examples of sequence-specific recombinase target sites include, but are not limited to, lox sites, frt sites, att sites and dif sites.

[0063] The term “lox site” as used herein refers to a nucleotide sequence at which the product of the cre gene of bacteriophage P1, Cre recombinase, can catalyze a site-specific recombination. A variety of lox sites are known to the art including the naturally occurring loxP (the sequence found in the P1 genome), loxB, loxL and loxR (these are found in the E. coli chromosome) as well as a number of mutant or variant lox sites such as loxP511, loxΔ86, loxΔ117, loxC2, loxP2, loxP3, loxP23, loxS, and loxH.
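As an illustration of what a lox site looks like at the sequence level, the sketch below checks the architecture of the commonly cited 34-bp wild-type loxP sequence: two 13-bp inverted repeats flanking an 8-bp asymmetric spacer. The sequence is typed from general knowledge and is not copied from SEQ ID NO:12 of this document; the helper names are illustrative.

```python
# Commonly cited wild-type loxP sequence (assumption: not taken from SEQ ID NO:12).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_lox(site: str):
    """Split a 34-bp lox-like site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("expected a 34-bp site")
    return site[:13], site[13:21], site[21:]


left, spacer, right = split_lox(LOXP)

# The two 13-bp arms are inverted repeats of each other.
arms_are_inverted_repeats = reverse_complement(left) == right
# The spacer is not its own reverse complement, which gives the site a direction.
spacer_is_asymmetric = reverse_complement(spacer) != spacer

print(left, spacer, right)
print(arms_are_inverted_repeats, spacer_is_asymmetric)
```

The same split could be applied to variant sites to see whether a change falls in an arm or in the spacer, provided their sequences are supplied from the sequence listing.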

[0064] The term “frt site” as used herein refers to a nucleotide sequence at which the product of the FLP gene of the yeast 2 μm plasmid, FLP recombinase, can catalyze a site-specific recombination.

[0065] The term “unique restriction enzyme site” indicates that the recognition sequence for a given restriction enzyme appears once within a nucleic acid molecule. For example, the EcoRI site is a unique restriction enzyme site within the plasmid pUNI-10 (SEQ ID NO:1).
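Uniqueness in this sense can be checked by counting occurrences of a recognition sequence. The following is a minimal sketch, assuming a palindromic recognition sequence (such as GAATTC for EcoRI) so that searching one strand suffices; the toy plasmid sequence and the function names are hypothetical and are not derived from SEQ ID NO:1. For a circular molecule the search also covers the junction between the end and the start of the written sequence.

```python
def count_sites(seq: str, site: str, circular: bool = True) -> int:
    """Count (possibly overlapping) occurrences of `site` in `seq`.

    For circular molecules the first len(site) - 1 bases are appended so
    that a site spanning the origin of the written sequence is also found.
    """
    seq, site = seq.upper(), site.upper()
    search = seq + seq[: len(site) - 1] if circular else seq
    count, start = 0, 0
    while True:
        i = search.find(site, start)
        if i == -1:
            return count
        count += 1
        start = i + 1


def is_unique(seq: str, site: str, circular: bool = True) -> bool:
    """True if `site` appears exactly once in `seq`."""
    return count_sites(seq, site, circular) == 1


# Hypothetical toy plasmid; its only GAATTC spans the end/start junction.
toy_plasmid = "TTCAAGCTTGGATCCAAAGAA"

print(is_unique(toy_plasmid, "GAATTC"))                  # circular search
print(is_unique(toy_plasmid, "GAATTC", circular=False))  # linear search
```

A non-palindromic recognition sequence would also need to be searched as its reverse complement, which this sketch deliberately leaves out.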

[0066] A restriction enzyme site is said to be located “adjacent to the 3′ end of a sequence-specific recombinase target site” if the restriction enzyme recognition site is located downstream of the 3′ end of the sequence-specific recombinase target site. The adjacent restriction enzyme site may, but need not, be contiguous with the last or 3′ nucleotide comprising the sequence-specific recombinase target site. For example, the EcoRI site of pUNI-10 is located adjacent (within 3 nucleotides) to the 3′ end of the loxP site (see FIG. 2B); the XhoI, NdeI, and NcoI sites are also adjacent (i.e., within about 10-150 nucleotides) to the loxP site but these sites are not contiguous with the 3′ end of the loxP site in pUNI-10.

[0067] The terms “polylinker” or “multiple cloning site” refer to a cluster of restriction enzyme sites on a nucleic acid construct which are utilized for the insertion and/or excision of nucleic acid sequences such as the coding region of a gene, lox sites, etc.

[0068] The term “prokaryotic termination sequence” refers to a nucleic acid sequence which is recognized by the RNA polymerase of a prokaryotic host cell and results in the termination of transcription. Prokaryotic termination sequences commonly comprise a GC-rich region that has a twofold symmetry followed by an AT-rich sequence [Stryer, supra]. A commonly used prokaryotic termination sequence is the T7 termination sequence. A variety of termination sequences are known to the art and may be employed in the nucleic acid constructs of the present invention including, but not limited to, the T_(INT), T_(L1), T_(L2), T_(L3), T_(R1), T_(R2), T_(6S) termination signals derived from the bacteriophage lambda [Lambda II, Hendrix et al. Eds., supra] and termination signals derived from bacterial genes such as the trp gene of E. coli [Stryer, supra].

[0069] The term “eukaryotic polyadenylation sequence” (also referred to as a “poly A site” or “poly A sequence”) as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized in an expression vector may be “heterologous” or “endogenous.” An endogenous poly A signal is one that is found naturally at the 3′ end of the coding region of a given gene in the genome. A heterologous poly A signal is one which is isolated from one gene and placed 3′ of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation [J. Sambrook, supra, at 16.6-16.7]; numerous vectors contain the SV40 poly A signal [e.g., pCEP4, pREP4, pEBVHis (Invitrogen)]. Another commonly used heterologous poly A signal is derived from the bovine growth hormone (BGH) gene; the BGH poly A signal is available on a number of commercially available vectors [e.g., pcDNA3.1, pZeoSV2, pSecTag (Invitrogen)]. The poly A signal from the Herpes simplex virus thymidine kinase (HSV tk) gene is also often used as a poly A signal on expression vectors. Vectors containing the HSV tk poly A signal include the pBK-CMV, pBK-RSV, and pOP13CAT vectors from Stratagene.

[0070] As used herein, the terms “selectable marker” or “selectable marker gene” refer to a gene which encodes an enzymatic activity that confers the ability to grow in medium lacking what would otherwise be an essential nutrient (e.g., the TRP1 gene in yeast cells). In addition, a selectable marker may confer resistance to an antibiotic or drug upon the cell in which the selectable marker is expressed. A selectable marker may be used to confer a particular phenotype upon a host cell. When a host cell must express a selectable marker to grow in selective medium, the marker is said to be a positive selectable marker (e.g., antibiotic resistance genes which confer the ability to grow in the presence of the appropriate antibiotic). Selectable markers can also be used to select against host cells containing a particular gene (e.g., the ΦX174 E gene, and the sacB gene which, if expressed, kills bacterial host cells grown in medium containing 5% sucrose). Selectable markers used in this manner are referred to as negative selectable markers or counter-selectable markers.

[0071] As used herein, the term “vector” is used in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another. The term “vehicle” is sometimes used interchangeably with “vector.” A “vector” is a type of “nucleic acid construct.” The term “nucleic acid construct” includes circular nucleic acid constructs such as plasmid constructs, phagemid constructs, cosmid vectors, etc. as well as linear nucleic acid constructs (e.g., λ phage constructs and PCR products). The nucleic acid construct may comprise expression signals such as a promoter and/or an enhancer (in such a case it is referred to as an expression vector).

[0072] The term “expression vector” as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.

[0073] The terms “in operable combination,” “in operable order,” and “operably linked” as used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.

[0074] The terms “transformation” and “transfection” as used herein refer to the introduction of foreign DNA into prokaryotic or eukaryotic cells. Transformation of prokaryotic cells may be accomplished by a variety of means known to the art including the treatment of host cells with CaCl₂ to make competent cells, electroporation, etc. Transfection of eukaryotic cells may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and biolistics, among other means.

[0075] As used herein, the terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.

[0076] The term “recombinant DNA molecule” as used herein refers to a DNA molecule that comprises segments of DNA joined together by means of molecular biological techniques.

[0077] The term “recombinant protein” or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule.

[0078] DNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand. The promoter and enhancer elements that direct transcription of a linked gene are generally located 5′ or upstream of the coding region. However, enhancer elements can exert their effect even when located 3′ of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3′ or downstream of the coding region.

[0079] The 3′ end of a promoter element is said to be located upstream of the 5′ end of a sequence-specific recombinase target site when (moving in a 5′ to 3′ direction along the nucleic acid molecule) the 3′ terminus of a promoter element (the transcription start site is taken as the 3′ end of a promoter element) precedes the 5′ end of the sequence-specific recombinase target site. The 3′ end of the promoter element may be located adjacent (generally within about 0 to 500 bp) to the 5′ end of the sequence-specific recombinase target site. Such an arrangement is used when the pHOST vector is not intended to permit the expression of a translational fusion with the gene of interest donated by a pUNI vector. Alternatively, when the pHOST vector is intended to permit the expression of a translational fusion, the 3′ end of the promoter element is located upstream of both the sequences encoding the amino-terminus of a fusion protein and the 5′ end of the sequence-specific recombinase target site. In this case, the 5′ end of the sequence-specific recombinase target site is located within the coding region of the fusion protein (e.g., located downstream of both the promoter element and the sequences encoding the affinity domain, such as Gst).

[0080] As used herein, the phrase “an oligonucleotide having a nucleotide sequence encoding a gene” refers to a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product. The coding region may be present in either a cDNA, genomic DNA, or RNA form. When present in a DNA form, the oligonucleotide may be single-stranded (i.e., the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.

[0081] As used herein, the term “regulatory element” refers to a genetic element that controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc. (defined infra).

[0082] Transcriptional control signals in eukaryotes comprise “promoter” and “enhancer” elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription [Maniatis, T. et al., Science 236:1237 (1987)]. Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect, and mammalian cells and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types [for review, see Voss, S. D. et al., Trends Biochem. Sci., 11:287 (1986) and Maniatis, T. et al., supra (1987)]. For example, the SV40 early gene enhancer is very active in a wide variety of cell types from many mammalian species and has been widely used for the expression of proteins in mammalian cells [Dijkema, R. et al., EMBO J. 4:761 (1985)]. Two other examples of promoter/enhancer elements active in a broad range of mammalian cell types are those from the human elongation factor 1α gene [Uetsuki, T. et al., J. Biol. Chem., 264:5791 (1989), Kim, D. W. et al., Gene 91:217 (1990) and Mizushima, S. and Nagata, S., Nuc. Acids. Res., 18:5322 (1990)] and the long terminal repeats of the Rous sarcoma virus [Gorman, C. M. et al., Proc. Natl. Acad. Sci. USA 79:6777 (1982)] and the human cytomegalovirus [Boshart, M. et al., Cell 41:521 (1985)].

[0083] As used herein, the term “promoter/enhancer” denotes a segment of DNA that contains sequences capable of providing both promoter and enhancer functions (i.e., the functions provided by a promoter element and an enhancer element, see above for a discussion of these functions). For example, the long terminal repeats of retroviruses contain both promoter and enhancer functions. The enhancer/promoter may be “endogenous” or “exogenous” or “heterologous.” An “endogenous” enhancer/promoter is one which is naturally linked with a given gene in the genome. An “exogenous” or “heterologous” enhancer/promoter is one which is placed in juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological techniques) such that transcription of that gene is directed by the linked enhancer/promoter.

[0084] The presence of “splicing signals” on an expression vector often results in higher levels of expression of the recombinant transcript. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site [Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.7-16.8]. A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.

[0085] Eukaryotic expression vectors may also contain “viral replicons” or “viral origins of replication.” Viral replicons are viral DNA sequences that allow for the extrachromosomal replication of a vector in a host cell expressing the appropriate replication factors. Vectors that contain either the SV40 or polyoma virus origin of replication replicate to high copy number (up to 10⁴ copies/cell) in cells that express the appropriate viral T antigen. Vectors that contain the replicons from bovine papillomavirus or Epstein-Barr virus replicate extrachromosomally at low copy number (~100 copies/cell).

[0086] As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.

[0087] As used herein, the term “gene” means the deoxyribonucleotide sequences comprising the coding region of a structural gene and including sequences located adjacent to the coding region on both the 5′ and 3′ ends such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ non-translated sequences. The sequences that are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ non-translated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript. Introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide. When a gene is altered such that its product is no longer biologically active in a wild-type fashion, the mutation is referred to as a “loss-of-function” mutation. When a gene is altered such that a portion or the entirety of the gene is deleted or replaced, the mutation is referred to as a “knockout” mutation.

[0088] In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences that are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers that control or influence the transcription of the gene. The 3′ flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage, and polyadenylation.

[0089] As used herein, the term “purified” or “to purify” refers to the removal of contaminants from a sample. For example, recombinant Cre polypeptides are expressed in bacterial host cells (e.g., as a Gst-Cre fusion protein) and the Cre polypeptides are purified by the removal of at least a portion of the host cell proteins; the percent of recombinant Cre polypeptides is thereby increased in the sample.

[0090] The term “native protein” is used herein to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is, the native protein contains only those amino acids found in the protein as it occurs in nature. A native protein may be produced by recombinant means or may be isolated from a naturally occurring source.

[0091] As used herein, the term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four amino acid residues to the entire amino acid sequence minus one amino acid.

[0092] As used herein, the term “fusion protein” refers to a chimeric protein containing the protein of interest (e.g., the Cre protein) joined to an exogenous protein fragment (e.g., the fusion partner which consists of non-Cre protein sequences). The fusion partner may enhance solubility of the protein of interest as expressed in a host cell, may provide an affinity tag to allow purification of the recombinant fusion protein from the host cell or culture supernatant, or both, among other desired characteristics. If desired, the fusion partner may be removed from the protein of interest by a variety of enzymatic or chemical means known to the art.

DESCRIPTION OF THE INVENTION

[0093] The present invention provides compositions and methods that comprise a system for the rapid subcloning of nucleic acid sequences in vivo and in vitro without the need to use restriction enzymes. This system is referred to as the Univector Fusion System or Univector Plasmid-fusion System (UPS). The UPS employs site-specific recombination to catalyze plasmid fusion between a Univector (i.e., a plasmid containing a gene of interest) and host vectors containing regulatory information. In some embodiments of the present invention, plasmid fusion events are genetically selected and result in placement of the gene of interest under the control of novel regulatory elements. A second UPS-related method of the present invention allows for the precise transfer of coding sequences alone from a Univector into a host vector. UPS further provides means for the subcloning of entire nucleic acid libraries and the directional cloning of linear nucleic acid molecules (e.g., PCR products).

[0094] The UPS offers many advantages over previously available technologies for the manipulation of genes. For example, for a routine analysis of a new gene, it may be desirable to express it in bacteria as a glutathione-S-transferase (Gst) or polyhistidine fusion for purification and antibody production, to fuse it to the DNA-binding domain of GAL4 or lexA for two-hybrid analysis, to express it from the T7 promoter to allow generation of a riboprobe or mRNA for in vitro transcription and translation, and to express it in baculovirus, all in the course of a single study. One might also wish to express the gene under the regulation of different promoters in a variety of organisms or to mark it with different epitope tags to facilitate subsequent biochemical or immunological analysis. All of these manipulations consume significant amounts of time and energy using previously available technologies for two reasons. First, each of the different vectors required for these studies was, for the most part, developed independently and thus contains different sequences and restriction sites for insertion of genes. Therefore, genes must be individually tailored to adapt to each of these vectors. Second, the DNA sequence of any given gene varies and can contain internal restriction sites that make it incompatible with particular vectors, thereby complicating manipulation. The advent of the polymerase chain reaction (PCR) has greatly facilitated the alteration of gene sequences and creation of compatible restriction sites for subcloning purposes. However, the high error rate of thermostable polymerases requires the sequence of each PCR-derived DNA fragment to be verified, a time-consuming process.

[0095] The availability of whole genome sequences now provides the opportunity to analyze large sets of genes for both genetic and biochemical properties. The need to perform parallel processing of large gene sets exponentially amplifies the current defects associated with conventional cloning methods. The methods and compositions of the present invention provide a series of recombination-based approaches that significantly reduce the time and effort involved in generating multiple transcriptional and translational fusions for gene analysis and cDNA library construction. The present invention provides a system whereby a gene can be placed under the control of any of a variety of promoters or fused in frame to other proteins or peptides without the use of restriction enzymes. As discussed above, the UPS uses site-specific recombination to fuse two plasmids at a unique sequence adjacent to both a regulatory region and the 5′ end of the gene of interest, thereby placing the gene under new regulation. This system, together with the other methods and compositions of the present invention discussed herein, provides a multifaceted approach for the rapid and efficient generation and manipulation of recombinant DNA, thus making possible parallel processing of whole genome sets of coding sequences.

[0096] The basis of the UPS is a vector termed the “Univector” or the “pUNI” vector into which sequences encoding a gene of interest (cDNA or genomic) are inserted. The pUNI vector has a sequence-specific recombinase target site, such as a loxP site, preceding the insertion site for the gene of interest, a selectable marker gene (this feature is optional) and a conditional origin of replication that is active only in host cells expressing the requisite trans-acting replication factor (this feature is optional). The pUNI vectors are designed to contain a gene of interest but lack a promoter for the expression of the gene of interest. The gene of interest may be cloned directly into the pUNI vector (i.e., the pUNI vector may be used as a cloning vector, particularly for the cloning of cDNA libraries) or a previously cloned gene of interest may be inserted (i.e., subcloned) into the pUNI vector.

[0097] Using a sequence-specific recombinase (e.g., Cre recombinase), a precise fusion of the pUNI vector into a second vector containing another sequence-specific recombinase target site is catalyzed. The second vector, referred to generically as a “pHOST” vector, is a vector (e.g., expression vector) that contains the sequence-specific recombinase target site downstream of a regulatory element (e.g., a promoter) contained within the pHOST vector. Following the site-specific recombination event which occurs between the single sequence-specific recombinase target sites located on each vector (e.g., the pUNI vector and the pHOST vector), the two vectors are stably fused in a manner that places the gene of interest under the control of the regulatory element contained within the pHOST vector. When used for transfer into an expression vector, this fusion event also occurs in a manner that retains the proper translational reading frame of the gene of interest.

[0098] In some embodiments of the present invention, the fusion or recombination event can be selected for by selecting for the ability of host cells, which do not express a trans-acting replication factor required for replication of a conditional origin contained on the pUNI vector, to acquire a selectable phenotype conferred by the selectable marker gene (if present) on the pUNI vector. In these embodiments, the pUNI vector cannot replicate in cells that do not express the trans-acting replication factor and therefore, unless the pUNI vector has integrated into the second vector that contains a non-conditional origin of replication, pUNI will be lost from the host cell.
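
The selection logic of the preceding paragraph can be sketched as a small truth-table model. This is an illustrative sketch only and not part of the disclosed methods: the `Plasmid` class, the function names, and the origin and marker assigned to the host vector (a ColE1-type origin and ampicillin resistance) are assumptions made for the example, while the R6Kγ origin, its dependence on the pir gene product, and selection for kanamycin resistance are taken from elsewhere in this description.

```python
from dataclasses import dataclass

# Trans-acting factor each origin needs from the host; None means the
# origin is non-conditional. The ColE1 entry is an illustrative assumption.
ORIGIN_REQUIRES = {"R6Kgamma": "pir", "ColE1": None}


@dataclass(frozen=True)
class Plasmid:
    origins: frozenset
    markers: frozenset


def fuse(a: Plasmid, b: Plasmid) -> Plasmid:
    """A fused plasmid carries the origins and markers of both parents."""
    return Plasmid(a.origins | b.origins, a.markers | b.markers)


def replicates(p: Plasmid, host_factors: set) -> bool:
    """True if at least one origin on the plasmid can fire in this host."""
    return any(
        ORIGIN_REQUIRES[o] is None or ORIGIN_REQUIRES[o] in host_factors
        for o in p.origins
    )


def survives(p: Plasmid, host_factors: set, selected_marker: str) -> bool:
    """A transformant grows only if the plasmid is kept and confers the marker."""
    return replicates(p, host_factors) and selected_marker in p.markers


# Hypothetical vectors: pUNI with a conditional origin and Kn resistance,
# pHOST with a non-conditional origin and a different marker.
p_uni = Plasmid(frozenset({"R6Kgamma"}), frozenset({"KnR"}))
p_host = Plasmid(frozenset({"ColE1"}), frozenset({"ApR"}))
fusion = fuse(p_uni, p_host)
```

In a host lacking the pir product, modeled here as an empty `host_factors` set, only `fusion` passes `survives(..., "KnR")`, which mirrors the reasoning that unfused pUNI is lost and unfused pHOST does not confer the selected phenotype.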

[0099] The Univector Fusion System allows any number of expression or fusion constructs containing the gene of interest present on the pUNI vector to be made rapidly (e.g., within a single day). Using conventional cloning or subcloning techniques which employ restriction enzyme digestion(s), the production of a single expression vector containing a gene of interest can take several days (i.e., for the design and construction of each expression vector). In contrast, with the methods and compositions of the present invention, once a battery of expression vectors modified to contain the appropriate sequence-specific recombinase target site is made, a gene of interest can be transferred to any number of expression vectors in an afternoon using the Univector Fusion System. For example, FIG. 1 provides a schematic illustrating the straightforward recombination methods of the pUNI vectors and the Univector Fusion System.

[0100] The present invention further provides methods and compositions for directional subcloning of PCR fragments and other nucleic acid molecules into Univectors or other vectors and methods and compositions for generation of epitope tags and other fusions at the 3′ end of open reading frames using homologous recombination.

[0101] In general, UPS can be used to fuse any coding region of interest either with a specific promoter to gain novel transcriptional regulation, with another coding sequence to produce a fusion protein with novel properties (e.g., an epitope tag for immunological detection or a DNA binding domain or transcriptional activation domain for two-hybrid analysis), or with any other desired regulatory element. As discussed above, the UPS eliminates the need for restriction enzymes, DNA ligases, and many in vitro manipulations required for subcloning. This relieves the constraints on cloning vectors with respect to DNA sequence and size since the UPS reaction is independent of vector size or sequence. Furthermore, the time-consuming processes inherent in conventional cloning, such as the identification of a suitable vector, designing a cloning strategy, restriction endonuclease digestion, agarose gel electrophoresis, isolation of DNA fragments, and the ligation reaction, are shortened to a 20-minute UPS reaction. Due to the uniform nature of the UPS reaction and its simplicity, dozens of constructs can be made simultaneously by simply using different recipient vectors. In addition, in contrast to restriction enzymes and DNA ligases, recombinases (e.g., Gst-Cre) can be made inexpensively in large quantities. These features will save investigators significant amounts of time and expense.

[0102] Together, these methods constitute a comprehensive recombinational strategy for the generation and manipulation of recombinant DNA that can be used for the parallel processing of gene sets, an ability required for genomic analyses.

[0103] a) Conditional Origins of Replication and Suitable Host Cells

[0104] In some embodiments of the present invention, the pUNI vector comprises a conditional origin of replication. Conditional origins of replication are origins that require the presence or expression of a trans-acting factor in the host cell for replication. A variety of conditional origins of replication functional in prokaryotic hosts (e.g., E. coli) are known to the art. The present invention is illustrated with, but not limited by, the use of the R6Kγ origin, oriR, from the plasmid R6K. The R6Kγ origin requires a trans-acting factor, the π protein supplied by the pir gene [Metcalf et al. (1996) Plasmid 35:1]. E. coli strains containing the pir gene will support replication of R6Kγ origins to medium copy number. A strain containing a mutant allele of pir, pir-116, will allow an even higher copy number of constructs containing the R6Kγ origin (i.e., 15 copies per cell for the wild type versus 250 copies per cell for the mutant). This property may be useful when potentially toxic genes are manipulated, although the chances of expression of a toxic gene are low because, in preferred embodiments of the present invention, the Univector either contains no promoter or contains a promoter driving the neo gene which is transcribed in the opposite direction from the gene of interest.

[0105] E. coli strains that express the pir or pir-116 gene product include BW18815 (ATCC 47079; this strain contains the pir-116 gene), BW19094 (ATCC 47080; this strain contains the pir gene), BW20978 (this strain contains the pir-116 gene), BW20979 (this strain contains the pir gene), BW21037 (this strain contains the pir-116 gene) and BW21038 (this strain contains the pir gene) (Metcalf et al., supra).

[0106] Other conditional origins of replication suitable for use on the pUNI vectors of the present invention include, but are not limited to:

[0107] 1) the RK2 oriV from the plasmid RK2 (ATCC 37125). The RK2 oriV requires a trans-acting protein encoded by the trfA gene [Ayres et al. (1993) J. Mol. Biol. 230:174];

[0108] 2) the bacteriophage P1 ori which requires the repA protein for replication [Pal et al. (1986) J. Mol. Biol. 192:275];

[0109] 3) the origin of replication of the plasmid pSC101 (ATCC 37032) which requires a plasmid encoded protein, repA, for replication [Sugiura et al. (1992) J. Bacteriol. 175:5993]. The pSC101 ori also requires IHF, an E. coli protein. E. coli strains carrying the himA and himD (hip) mutants (the him and hip genes encode subunits of IHF) cannot support pSC101 replication [Stenzel et al. (1987) Cell 49:709];

[0110] 4) the bacteriophage lambda ori which requires the lambda O and P proteins [Lambda II, Hendrix et al. Eds., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1983)];

[0111] 5) pBR322 and other ColE1 derivatives will not replicate in polA mutants of E. coli and therefore, these origins of replication can be used in a conditional manner [Grindley and Kelley (1976) Mol. Gen. Genet. 143:311]; and

[0112] 6) replication-thermosensitive plasmids such as pSU739 or pSU300 which contain a thermosensitive replicon derived from plasmid pSC101, rep pSC101^(ts) which comprises oriV [Mendiola and de la Cruz (1989) Mol. Microbiol. 3:979 and Francia and Lobo (1996) J. Bact. 178:894]. pSU739 and pSU300 are stably maintained in E. coli strain DH5α (Gibco BRL) at a growth temperature of 30° C. (42° C. is non-permissive for replication of this replicon).

[0113] Other conditional origins of replication, including other temperature sensitive replicons, are known to the art and may be employed in the vectors and methods of the present invention.

[0114] b) Sequence-Specific Recombinases and Target Recognition Sites

[0115] The precise fusion between the pUNI vector and the expression vector is catalyzed by a site-specific recombinase. Site-specific recombinases are enzymes that recognize a specific DNA site or sequence (referred to herein generically as a “sequence-specific recombinase target site”) and catalyze the recombination of DNA in relation to these sites. Site-specific recombinases are employed for the recombination of DNA in both prokaryotes and eukaryotes. Examples of site-specific recombination include, but are not limited to: 1) chromosomal rearrangements that occur in Salmonella typhimurium during phase variation, inversion of the FLP sequence during the replication of the yeast 2 μm circle, and in the rearrangement of immunoglobulin and T cell receptor genes in vertebrates, 2) integration of bacteriophages into the chromosome of prokaryotic host cells to form a lysogen, and 3) transposition of mobile genetic elements (e.g., transposons) in both prokaryotes and eukaryotes. The term “site-specific recombinase” refers to enzymes that recognize short DNA sequences that become the crossover regions during the recombination event and includes recombinases, transposases, and integrases.

[0116] The present invention is illustrated with, but not limited by, the use of vectors containing lox sites (e.g., loxP sites) and the recombination of these vectors using the Cre recombinase of bacteriophage P1. The Cre protein catalyzes recombination of DNA between two loxP sites and is involved in the resolution of P1 dimers generated by replication of circular lysogens [Sternberg et al. (1981) Cold Spring Harbor Symp. Quant. Biol. 45:297]. Cre can function in vitro and in vivo in many organisms including, but not limited to, bacteria, fungi, and mammals [Abremski et al. (1983) Cell 32:1301; Sauer (1987) Mol. Cell. Biol. 7:2087; and Orban et al. (1992) Proc. Natl. Acad. Sci. 89:6861]. A schematic for one embodiment of Cre-mediated plasmid fusion is shown in FIG. 14. In this figure, the Univector, pUNI, is the plasmid into which the gene of interest is inserted and pHOST represents the recipient vector that contains the appropriate transcriptional and/or translational regulatory sequences that will eventually control the expression of the gene of interest. A recombinant expression construct is made through Cre-loxP-mediated site-specific recombination that fuses these two plasmids. This in vitro reaction generates a dimeric recombinant plasmid in which the gene of interest from pUNI is placed downstream of the promoter present on the host vector. In this example, the recombinant plasmid in FIG. 14 can be selected in a pir⁻ bacterial strain by selecting Kn^(r).
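
The geometry of the plasmid fusion described above can be illustrated with a string-level sketch. It is a toy model under stated assumptions, not the disclosed procedure: circular plasmids are represented as strings whose last base joins the first, both plasmids are assumed to carry exactly one loxP site on the same strand, the loxP sequence used is the commonly cited 34 bp sequence (this passage refers to it only by SEQ ID NO), and the promoter, gene, and backbone sequences as well as the function names are hypothetical.

```python
# Commonly cited 34 bp loxP sequence (assumed here; see lead-in).
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def rotate_to_lox(circle: str) -> str:
    """Rewrite a circular sequence so that it starts at its single loxP site."""
    doubled = circle + circle  # lets a site spanning the string ends be found
    start = doubled.find(LOXP)
    if start == -1 or start >= len(circle):
        raise ValueError("no loxP site found on this strand")
    return doubled[start:start + len(circle)]


def fuse_circles(p_uni: str, p_host: str) -> str:
    """Model intermolecular recombination at loxP as one larger circle.

    The product is written starting at the loxP site that precedes the
    pUNI-derived sequence; the end of the string joins back to its start.
    """
    return rotate_to_lox(p_uni) + rotate_to_lox(p_host)


# Hypothetical toy plasmids: pUNI has loxP directly 5' of the gene and no
# promoter; pHOST has a promoter directly 5' of its loxP site.
PROMOTER, GENE = "TTGACA", "ATGGCC"
p_uni = LOXP + GENE + "CCCC"
p_host = "GGGG" + PROMOTER + LOXP + "AAAA"
dimer = fuse_circles(p_uni, p_host)
```

Reading the product as a circle (for example by searching `dimer + dimer`) shows the host promoter followed by a loxP site and then the gene from pUNI, which is the arrangement the text describes for the dimeric recombinant plasmid.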

[0117] The loxP sites may be present on the same DNA molecule or they may be present on different DNA molecules; the DNA molecules may be linear or circular or a combination of both. The loxP site consists of a double-stranded 34 bp sequence (SEQ ID NO:12) which comprises two 13 bp inverted repeat sequences separated by an 8 bp spacer region [Hoess et al. (1982) Proc. Natl. Acad. Sci. USA 79:3398 and U.S. Pat. No. 4,959,317, the disclosure of which is herein incorporated by reference]. The internal spacer sequence of the loxP site is asymmetrical and thus, two loxP sites can exhibit directionality relative to one another [Hoess et al. (1984) Proc. Natl. Acad. Sci. USA 81:1026]. When two loxP sites on the same DNA molecule are in a directly repeated orientation, Cre excises the DNA between these two sites leaving a single loxP site on the DNA molecule [Abremski et al. (1983) Cell 32:1301]. If two loxP sites are in opposite orientation on a single DNA molecule, Cre inverts the DNA sequence between these two sites rather than removing the sequence. Two circular DNA molecules each containing a single loxP site will recombine with one another to form a mixture of monomer, dimer, trimer, etc. circles. The concentration of the DNA circles in the reaction can be used to favor the formation of monomer (lower concentration) or multimeric circles (higher concentration).
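
The structure of the loxP site and the orientation-dependent outcomes described above can be made concrete with a short sketch. It is a simplified string model and should be read as such: the loxP sequence is the commonly cited 34 bp sequence (assumed, since the text gives only a SEQ ID NO), the function names are hypothetical, and the exact crossover position within the spacer is not modeled.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # assumed 34 bp loxP sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox(site: str):
    """Split a 34 bp lox site into left arm (13), spacer (8), right arm (13)."""
    assert len(site) == 34
    return site[:13], site[13:21], site[21:]


def find_sites(seq: str):
    """Return sorted (start, strand) pairs for every loxP site in seq."""
    sites = []
    for strand, pattern in (("+", LOXP), ("-", revcomp(LOXP))):
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(sites)


def cre_two_sites(seq: str):
    """Model Cre acting on a linear molecule that carries exactly two sites.

    Direct repeats give ("excision", remaining linear molecule, excised circle);
    opposite orientation gives ("inversion", rearranged molecule, None).
    """
    sites = find_sites(seq)
    if len(sites) != 2:
        raise ValueError("expected exactly two loxP sites")
    (i, s1), (j, s2) = sites
    if s1 == s2:
        linear = seq[:i + 34] + seq[j + 34:]   # one loxP site is left behind
        circle = seq[i + 34:j + 34]            # intervening DNA plus one site
        return "excision", linear, circle
    inverted = seq[:i + 34] + revcomp(seq[i + 34:j]) + seq[j:]
    return "inversion", inverted, None
```

For instance, `cre_two_sites("GGG" + LOXP + "AAAC" + LOXP + "TTT")` reports an excision, whereas replacing the second site with `revcomp(LOXP)` reports an inversion of the intervening `AAAC` segment. The `split_lox` helper also shows why orientation is defined at all: the two arms are reverse complements of each other, but the spacer is not its own reverse complement.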

[0118] Circular DNA molecules having a single loxP site will recombine with a linear molecule having a single loxP site to produce a larger linear molecule. Cre interacts with a linear molecule containing two directly repeating loxP sites to produce a circle containing the sequences between the loxP sites and a single loxP site and a linear molecule containing a single loxP site at the site of the deletion.

[0119] The Cre protein has been purified to homogeneity [Abremski et al. (1984) J. Mol. Biol. 259:1509] and the cre gene has been cloned and expressed in a variety of host cells [Abremski et al. (1983), supra]. Purified Cre protein is available from a number of suppliers (e.g., Novagen and New England Nuclear/DuPont).

[0120] The Cre protein also recognizes a number of variant or mutant lox sites (variant relative to the loxP sequence), including the loxB, loxL and loxR sites which are found in the E. coli chromosome [Hoess et al. (1982), supra]. Other variant lox sites include loxP511 [5′-ATAACTTCGTATAGTATACATTATACGAAGTTAT-3′ (SEQ ID NO:16); spacer region underlined; Hoess et al. (1986), supra], and loxC2 [5′-ACAACTTCGTATAATGTATGCTATACGAAGTTAT-3′ (SEQ ID NO:17); spacer region underlined; U.S. Pat. No. 4,959,317]. Cre catalyzes the cleavage of the lox site within the spacer region and creates a six base-pair staggered cut [Hoess and Abremski (1985) J. Mol. Biol. 181:351]. The two 13 bp inverted repeat domains of the lox site represent binding sites for the Cre protein. If two lox sites differ in their spacer regions in such a manner that the overhanging ends of the cleaved DNA cannot reanneal with one another, Cre cannot efficiently catalyze a recombination event using the two different lox sites. For example, it has been reported that Cre cannot recombine (at least not efficiently) a loxP site and a loxP511 site; these two lox sites differ in the spacer region. Two lox sites which differ due to variations in the binding sites (i.e., the 13 bp inverted repeats) may be recombined by Cre provided that Cre can bind to each of the variant binding sites. The efficiency of the reaction between two different lox sites (varying in the binding sites) may be lower than that between two lox sites having the same sequence (the efficiency will depend on the degree and the location of the variations in the binding sites). For example, the loxC2 site can be efficiently recombined with the loxP site, as these two lox sites differ by a single nucleotide in the left binding site.
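
The comparison of lox variants described above reduces to two checks: whether the 8 bp spacers are identical and how many positions differ in each 13 bp binding site. The sketch below applies those checks to the loxP511 and loxC2 sequences quoted in the text; the loxP sequence is the commonly cited 34 bp sequence (an assumption, as the text gives only its SEQ ID NO), and the helper names are hypothetical. Identical spacers are treated as a stand-in for compatible overhangs, and the sketch says nothing about actual reaction efficiency.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"     # assumed loxP sequence
LOXP511 = "ATAACTTCGTATAGTATACATTATACGAAGTTAT"  # as quoted in the text
LOXC2 = "ACAACTTCGTATAATGTATGCTATACGAAGTTAT"    # as quoted in the text


def parts(site: str):
    """Return (left binding site, spacer, right binding site) of a lox site."""
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def spacers_match(a: str, b: str) -> bool:
    """Identical spacers, used here as a proxy for reannealing overhangs."""
    return parts(a)[1] == parts(b)[1]


def arm_mismatches(a: str, b: str):
    """Count differing positions in the left and right binding sites."""
    (la, _, ra), (lb, _, rb) = parts(a), parts(b)
    left = sum(x != y for x, y in zip(la, lb))
    right = sum(x != y for x, y in zip(ra, rb))
    return left, right
```

With these sequences, loxP and loxP511 have matching binding sites but different spacers, while loxP and loxC2 share a spacer and differ at one position in the left binding site, consistent with the pairings described in the paragraph above.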

[0121] A variety of other site-specific recombinases may be employed in the methods of the present invention in place of the Cre recombinase. Alternative site-specific recombinases include, but are not limited to:

[0122] 1) the FLP recombinase of the 2 μm plasmid of Saccharomyces cerevisiae [Cox (1983) Proc. Natl. Acad. Sci. USA 80:4223] which recognizes the frt site. Like the loxP site, the frt site comprises two 13 bp inverted repeats separated by an 8 bp spacer [5′-GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC-3′ (SEQ ID NO:18); spacer underlined]. The FLP gene has been cloned and expressed in E. coli (Cox, supra) and in mammalian cells (PCT International Patent Application PCT/US92/01899, Publication No.: WO 92/15694, the disclosure of which is herein incorporated by reference) and has been purified [Meyer-Leon et al. (1987) Nucleic Acids Res. 15:6469; Babineau et al. (1985) J. Biol. Chem. 260:12313; and Gronostajski and Sadowski (1985) J. Biol. Chem. 260:12328];

[0123] 2) the Int recombinase of bacteriophage lambda (with or without Xis) which recognizes att sites (Weisberg et al. In: Lambda II, supra, pp. 211-250);

[0124] 3) the xerC and xerD recombinases of E. coli which together form a recombinase that recognizes the 28 bp dif site [Leslie and Sherratt (1995) EMBO J. 14:1561];

[0125] 4) the Int protein from the conjugative transposon Tn916 [Lu and Churchward (1994) EMBO J. 13:1541];

[0126] 5) TnpI and the β-lactamase transposons [Levesque (1990) J. Bacteriol. 172:3745];

[0127] 6) the Tn3 resolvase [Flanagan et al. (1989) J. Mol. Biol. 206:295 and Stark et al. (1989) Cell 58:779];

[0128] 7) the SpoIVC recombinase of Bacillus subtilis [Sato et al. (1990) J. Bacteriol. 172:1092];

[0129] 8) the Hin recombinase [Glasgow et al. (1989) J. Biol. Chem. 264:10072];

[0130] 9) the Cin recombinase [Haffter et al. (1988) EMBO J. 7:3991]; and

[0131] 10) the immunoglobulin recombinases [Malynn et al. (1988) Cell 54:453].

[0132] c) Modification of Expression Vectors

[0133] As discussed above, pUNI vectors are used to transfer a gene of interest into a suitably modified vector via site-specific recombination. The modified vectors or host vectors used in the Univector Fusion System are referred to as pHOST vectors. pHOST vectors are generally expression vectors (e.g., plasmids) which have been modified by the insertion of a sequence-specific recombinase target site (e.g., a lox site). However, the pHOST can comprise any regulatory sequence desired for manipulation of nucleic acids. The presence of the sequence-specific recombinase target site on the pHOST plasmid permits the rapid subcloning or insertion of the gene of interest contained within a pUNI vector to generate an expression vector capable of expressing the gene of interest. In some embodiments of the present invention, the pHOST vector may encode a protein domain such as an affinity domain including, but not limited to, glutathione-S-transferase (Gst), maltose binding protein (MBP), a portion of staphylococcal protein A (SPA), a polyhistidine tract, etc. A variety of commercially available expression vectors encoding such affinity domains are known to the art. The affinity domain may be located at either the amino- or carboxy-terminus of the fusion protein. When the pHOST plasmid contains a vector-encoded affinity domain, a fusion protein comprising the vector-encoded affinity domain and the protein of interest is generated when the pUNI and pHOST vectors are recombined.

[0134] To generate expression vectors intended to generate transcriptional fusions (i.e., pHOST does not contain a vector-encoded protein domain), a sequence-specific recombinase target site is placed after (i.e., downstream of) the start of transcription in the host vector. This is easily accomplished using synthetic oligonucleotides comprising the desired sequence-specific recombinase target site. In designing the oligonucleotide comprising the sequence-specific recombinase target site, care is taken to avoid introducing an ATG or start codon that might initiate translation inappropriately.
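The design check described above (avoiding an ATG in the oligonucleotide) can be illustrated with a minimal sketch that scans the sense strand of a candidate oligonucleotide for ATG triplets. The two sequences are the 34 nt loxP stretches found within the linker oligonucleotides of SEQ ID NO:4 and SEQ ID NO:5; the labels "forward" and "reverse" and the function name are assumptions made for this illustration only.

```python
# Illustrative sketch: locate potential start codons on the sense strand
# of an oligonucleotide. Labels and function name are hypothetical.
LOXP_FORWARD = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # loxP as found within SEQ ID NO:4
LOXP_REVERSE = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # loxP as found within SEQ ID NO:5


def start_codon_positions(oligo):
    """Return the 0-based positions of every ATG on the given strand."""
    oligo = oligo.upper()
    return [i for i in range(len(oligo) - 2) if oligo[i:i + 3] == "ATG"]


for label, seq in (("forward", LOXP_FORWARD), ("reverse", LOXP_REVERSE)):
    print(label, start_codon_positions(seq))
```

In this sketch the orientation found in SEQ ID NO:4 contains no ATG, whereas the opposite orientation contains two within the spacer, so the orientation in which the site is read by the transcript matters for this design rule.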

[0135] To generate expression vectors intended to generate a fusion protein between a vector-encoded protein domain located at the amino-terminus of the fusion protein and the protein of interest (encoded by the gene of interest contained within the pUNI vector) (i.e., a translational fusion), care is taken to place the sequence-specific recombinase target site in the correct reading frame such that: 1) an open reading frame is maintained through the sequence-specific recombinase target site on pHOST, and 2) the open reading frame in the sequence-specific recombinase target site on pHOST is in frame with the open reading frame found on the sequence-specific recombinase target site contained within the pUNI vector. In addition, the oligonucleotide comprising the sequence-specific recombinase target site on pHOST is designed to avoid the introduction of in-frame stop codons. The gene of interest contained within the pUNI vector is cloned in a particular reading frame so as to facilitate the creation of the desired fusion protein.
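The reading-frame requirement can be illustrated with a small sketch that reports which of the three frames of a sequence are free of stop codons. It is a simplified check under stated assumptions: only the strand as written is examined, the standard stop codons are used, and the function name is hypothetical. The sequences are the loxP stretch found within SEQ ID NO:4 and the full linker oligonucleotide of SEQ ID NO:4.

```python
# Illustrative sketch: find the stop-free reading frames of a sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}

LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"                # loxP as found within SEQ ID NO:4
LINKER_TOP = "CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG"  # SEQ ID NO:4


def open_frames(seq):
    """Return the frame offsets (0, 1, 2) of seq that contain no stop codon."""
    seq = seq.upper()
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


print(open_frames(LOXP))        # frames of the lox site alone
print(open_frames(LINKER_TOP))  # frames of the complete linker strand
print(LINKER_TOP[1:4])          # codon that opens the stop-free linker frame
```

For the linker strand the only stop-free frame is the one that begins at the ATG of the NcoI-compatible end, and that frame reads the lox site in its own single open frame.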

[0136] The modification of several expression vectors is provided in the examples below to illustrate the creation of suitable pHOST vectors. At present, approximately 40 pHOST vectors have been generated, including GST expression vectors, yeast GAL1 expression vectors, mammalian CMV expression vectors, and baculovirus expression vectors. In each case, expression was at or near the levels achieved by conventional cloning. A general strategy for generating any pHOST of interest involves the generation of a linker containing the desired sequence-specific recombinase target site (e.g., a lox site such as loxP or loxH) by annealing two complementary oligonucleotides. The annealed oligonucleotides form a linker having sticky ends that are compatible with ends generated by restriction enzymes whose sites are conveniently located in the parental expression vector (e.g., within a polylinker of the parental expression vector). Thus, any vector can be easily adapted for use with the UPS method.

[0137] d) In Vitro Recombination

[0138] The fusion of a pUNI vector and a pHOST vector is accomplished in vitro using a purified preparation of a site-specific recombinase (e.g., Cre recombinase). The pUNI vector and the pHOST vector are placed in a reaction vessel (e.g., a microcentrifuge tube) in a buffer compatible with the site-specific recombinase to be used. For example, when a Cre recombinase (native or a fusion protein form) is employed, the reaction buffer may comprise 50 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 30 mM NaCl and 1 mg/ml BSA. When a FLP recombinase is employed, the reaction buffer may comprise 50 mM Tris-HCl (pH 7.4), 10 mM MgCl₂, 100 μg/ml BSA [Gronostajski and Sadowski, supra]. The concentration of the pUNI vector and the pHOST vector may vary between 100 ng and 1.0 μg of each vector per 20 μl reaction volume, with about 0.1 μg of each nucleic acid construct (0.2 μg total) per 20 μl reaction being preferred. The concentration of the site-specific recombinase may be titered under a standard set of reaction conditions to find the optimal concentration of enzyme to be used as described in Example 4.
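As a worked illustration of the Cre buffer composition given above, the following sketch computes the stock volumes needed for 1 ml of a 10× concentrate, of which 2 μl would give the stated 1× concentrations in a 20 μl reaction. The 10× format and every stock concentration are assumptions chosen for the arithmetic; the text specifies only the final 1× concentrations.

```python
# Illustrative dilution arithmetic (C1 * V1 = C2 * V2) for a 10x Cre buffer.
# The 1x concentrations come from the text; stocks are hypothetical.
FINAL_1X = {
    "Tris-HCl pH 7.5 (mM)": 50,
    "MgCl2 (mM)": 10,
    "NaCl (mM)": 30,
    "BSA (mg/ml)": 1,
}
STOCKS = {  # assumed stock solutions, same units as FINAL_1X
    "Tris-HCl pH 7.5 (mM)": 1000,
    "MgCl2 (mM)": 1000,
    "NaCl (mM)": 5000,
    "BSA (mg/ml)": 50,
}


def concentrate_recipe(final_1x, stocks, fold=10, total_ul=1000.0):
    """Return the microlitres of each stock (plus water) for a concentrate."""
    recipe = {}
    for component, conc in final_1x.items():
        recipe[component] = conc * fold * total_ul / stocks[component]
    water = total_ul - sum(recipe.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this concentrate")
    recipe["water"] = water
    return recipe


for component, volume in concentrate_recipe(FINAL_1X, STOCKS).items():
    print(f"{component}: {volume:.1f} ul")
```

The same function can be reused for the FLP buffer by supplying its 1× concentrations and omitting NaCl.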

[0139] Following the in vitro fusion reaction, a portion of the reaction mixture is used to transform a suitable host cell to permit the recovery and propagation of the fused vectors. In some embodiments of the present invention, the host cell employed will not express the trans-acting factor required for replication of the conditional origin of replication contained within the pUNI vector (or alternatively the host cell will be grown at a temperature which is non-permissive for replication of a temperature sensitive replicon contained within the pUNI vector). The host cells will be grown under conditions that select for the presence of the selectable marker contained within the pUNI vector (e.g., growth in the presence of kanamycin when the pUNI vector contains a kanamycin resistance gene). Plasmid or non-chromosomal DNA is isolated from host cells which display the desired phenotype and subjected to restriction enzyme digestion to confirm that the desired fusion event has occurred.

[0140] e) Recombination in Prokaryotic Host Cells

[0141] The fusion of a pUNI vector and a pHOST vector may be accomplished in vivo using a host cell that expresses the appropriate site-specific recombinase (e.g., Cre recombinase). The host cell may express the recombinase as part of its genome or may be supplied with means for expressing the recombinase (e.g., a recombinase expression vector). In embodiments of the present invention that employ a pUNI vector with a conditional origin of replication, the host cell employed lacks the ability to express the trans-acting factor required for replication of the conditional origin of replication (or alternatively the host cell will be grown at a temperature which is non-permissive for replication of a temperature sensitive replicon contained within the pUNI vector).

[0142] The pUNI vector and the pHOST vector are cotransformed into the host cell using a variety of methods known to the art (e.g., transformation of cells made competent by treatment with CaCl₂, electroporation, etc.). The cotransformed host cells are grown under conditions that select for the presence of the selectable marker contained within the pUNI vector (e.g., growth in the presence of kanamycin when the pUNI vector contains the kanamycin resistance gene). Plasmid or non-chromosomal DNA is isolated from host cells which display the desired phenotype and subjected to restriction enzyme digestion to confirm that the desired fusion event has occurred.

[0143] f) Precise ORF Transfer (POT)

[0144] UPS results in the fusion of two plasmids and is suitable for the vast majority of expression needs. In rare cases where the size of the recombinant molecule is limiting (e.g., in the generation of retrovirus or adeno-associated viral [AAV] expression constructs), it might be desirable to transfer only the gene of interest and not the approximately 2 kb remainder of the Univector. To accomplish this, a second recombination event is utilized. In some embodiments of the present invention, this second recombination is catalyzed by the R recombinase [Araki et al. (1992) J. Mol. Biol. 225:25] that allows a resolution of the UPS generated heterodimer as described in Example 9, although a variety of second recombinases will find use with the present invention (e.g., the Res system). POT functions in vivo and in vitro. It is recommended that POT only be used in those cases where size is a limitation.

[0145] In some embodiments of the present invention, a standard UPS method is utilized to generate a dimer containing the entire pUNI and pHOST vectors, followed by a reaction with the second recombinase that excises the unwanted portions of the Univector. Alternatively, host cells or reaction conditions can be applied that allow both recombination reactions to occur in a single step (see Example 9). Cells containing the desired recombinant product can be selected for by using selectable markers, and/or conditional origins of replication.

[0146] g) Generation of 3′ Gene Fusions on the Univector

[0147] While UPS greatly facilitates the generation of fusion proteins at the N-terminus of the protein of interest, it is often necessary to modify proteins on the C-terminus (e.g., to add an epitope tag). To facilitate this class of modification, the present invention takes advantage of E. coli's endogenous homologous recombination system. It has been shown [Winans et al. (1985) J. Bacteriol. 161:1219] that E. coli strains mutant for recBC, but containing a suppressor sbc, could take up linear DNA and recombine it onto the E. coli chromosome or resident plasmids, much as has been shown for S. cerevisiae. recD mutants have been shown to behave in a similar manner [Russell et al. (1989) J. Bacteriol. 171:2609]. However, such systems have not been used for recombinant cloning in E. coli. In fact, these systems are incompatible with many cloning protocols, as the endogenous restriction modification systems of the cell would digest the samples to be cloned.

[0148] The present invention provides means to overcome these problems and to provide for effective cloning and recombination (e.g., with the UPS). To facilitate recombination onto Univector plasmids, the present invention provides BUN10, a recBCsbcBhsdR strain expressing pir-116. The hsdR mutation prevents restriction of nucleic acid (e.g., PCR amplified DNA) by the endogenous restriction modification system of E. coli. In one embodiment of the present invention, this system was tested using a 3×MYC epitope tag and the SKP1 gene in pUNI-10 as the recipient. pML74, which is pUNI-Amp containing a triple (3×) MYC epitope tag followed by a stop codon, was used as template DNA for PCR amplification with two primers, A and B. Primer A (SEQ ID NO:30) is 71 nt long, the first 50 nt of which correspond to the last 50 nt of the SKP1 coding region and the last 21 nt, the 3′ end of the primer, correspond to the first 21 nt of the DNA encoding the 3×MYC tag. The reading frames of SKP1 and the 3×MYC tag are in register. Primer B (SEQ ID NO:31) is 22 nt long and recognizes a site on pML74 common to pUNI vectors that begins 367 bp from the polylinker region. Amplification using primers A and B and pML74 as a template generated a fragment of DNA with 50 bp homology to the Univector. This amplification product was co-transformed with BamHI-SacI-cleaved pUNI-SKP1 into BUN10 cells and Kn^(r) transformants were selected and analyzed by restriction mapping. Homologous recombination events are selected because they allow the recircularization of the linearized vector. A schematic representation of this method is provided in FIG. 25. Ten percent of Kn^(r) transformants resulted in homologous recombination at the C-terminus of the SKP1 gene to generate a SKP1-3×MYC tag. This experiment demonstrates that homologous recombination in E. coli can be used to alter the sequence of genes in 3′ regions adjacent to restriction sites.
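The layout of primer A (50 nt of homology to the end of the coding region followed by 21 nt that prime on the tag) can be sketched as a simple string operation. The gene and tag sequences below are artificial placeholders, not the SKP1 or 3×MYC sequences, and the function name is hypothetical. The sketch also assumes that the coding region is supplied without its stop codon so that the two reading frames stay in register.

```python
# Illustrative sketch of the primer A layout: homology arm + priming region.
# GENE_CDS and TAG are artificial placeholders, not sequences from the text.
GENE_CDS = "ATG" + "GCT" * 30   # placeholder coding region, no stop codon
TAG = "GAACAGAAA" * 10          # placeholder for the DNA encoding a tag


def design_tagging_primer(cds_without_stop, tag, homology_nt=50, priming_nt=21):
    """Join the last homology_nt of the coding region to the first
    priming_nt of the tag, keeping the two reading frames in register."""
    if len(cds_without_stop) % 3 != 0:
        raise ValueError("coding region must be a whole number of codons")
    if len(cds_without_stop) < homology_nt or len(tag) < priming_nt:
        raise ValueError("input sequences are shorter than the requested arms")
    return cds_without_stop[-homology_nt:] + tag[:priming_nt]


primer_a = design_tagging_primer(GENE_CDS, TAG)
print(len(primer_a))  # total primer length
```

With the default arm lengths the primer is 71 nt long, matching the length stated for primer A.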

[0149] Furthermore, it is clear that this method is generally applicable to broader cloning strategies. Although the example above describes the use of an amplification product for recombination into the pUNI vector, any nucleic acid sample with sufficient sequence complementarity can be used. Thus, the sample to be inserted could be artificially synthesized or prepared by any other means. Additionally, the recombination event can be designed to occur at any desired location on any desired recipient vector (i.e., is not limited to the production of 3′ gene fusions).

[0150] h) Method for Directional Subcloning Into pUNI Vectors

[0151] When cloning blunt ended nucleic acid molecules, such as those generated by thermostable polymerases, it is desirable to have a way of identifying desired recombinant molecules (e.g., vectors containing the insert in a desired orientation). This is of great relevance to the UPS because the initial cloning of genes into pUNI will often utilize PCR amplified material. To facilitate this process, the present invention provides a method for directional subcloning into vectors (e.g., pUNI derivatives) that relies upon the generation of a reconstituted regulatory element from two partial sites located on the fragment to be cloned and the recipient vector, respectively. For example, a linear nucleic acid molecule to be inserted into a vector can be designed with a portion of a promoter at its 3′ or 5′ ends. The recipient vector is then designed with the remainder of the promoter, arranged such that, when the cloned fragment is inserted in the desired direction, an intact promoter is reconstituted and provides a means of detecting the successful directional cloning event.

[0152] It is clear that a variety of reconstituted regulatory elements can be employed to achieve detectable directional cloning. For example, reconstituted regulatory elements that find use with the present invention include, but are not limited to, promoters, repressors, operators, enhancers, enzyme recognition sites, selectable markers, and conditional origins of replication, among others. It is also contemplated that the reconstituted regulatory element may comprise a negative selection capability, such that fragments cloned in an undesired orientation reconstitute the regulatory element and are selected against. One skilled in the art will recognize the wide range of regulatory elements and applications that can be applied to this system.

[0153] To demonstrate the effectiveness of the above approach, the lac operator was employed to detect directional subcloning events. Luria and colleagues observed in the early 1960s that phage carrying the binding site for the lac repressor, lacO, could induce the expression of the endogenous lacZ gene by titrating out a limited number of repressor proteins [Miller and Reznikoff, Eds. (1978) The Operon, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.] and this was shown to be true when lacO was present on high copy number plasmids [Marians et al. (1976) Nature 263:744; and Heyneker et al. (1976) Nature 263:748], as illustrated in FIG. 22A. FIG. 22A shows a schematic representation of normal conditions in the absence of inducer (left diagram) where LacR is bound to the lac operator sites in front of lacZ and represses transcription. In the presence of high copy number plasmid containing the lacO sequence (right diagram), LacR repressors are titrated out by binding to plasmid borne lacO sites and the endogenous lacZ gene is expressed.

[0154] This observation was taken advantage of by the methods of the present invention, whereby the 3′ half of a lacO site was placed on a pUNI vector (i.e., pUNI-30). The lacO derivative used was a symmetrical 20 bp site that has an Eco47III site at the center. To utilize this method for cloning PCR derived material, primers were made corresponding to the SKP1 gene. A 10 bp sequence corresponding to the 5′ half of the symmetrical lacO sequence (shown in FIG. 22B) was added to the 5′ end of the 3′ primer. FIG. 22B shows this strategy, whereby primer A (5′) and B (3′) are used to amplify the gene of interest. The 5′ end of primer B contains a half lacO site which subsequently becomes the 3′ end of the PCR fragment indicated in the Figure. After ligating the PCR fragment into linearized pUNI-30 containing the other half of lacO, an intact lacO site is reconstituted and, in Lac⁺ cells, results in induction of endogenous β-galactosidase and production of blue colonies in the presence of X-Gal. The PCR fragment was ligated into Eco47III-cleaved pUNI-30 and transformed into BUN10, a Lac⁺ E. coli strain, and Kn^(r) colonies were selected on plates containing X-gal. Plasmids containing SKP1 in the proper orientation were identified by their dark blue color (shown by arrows in FIG. 22C). Reclosure of the vector without insert as well as the presence of the PCR fragment in the incorrect orientation result in the production of white or pale blue colonies. Ten out of 10 dark blue colonies contained SKP1 in the correct orientation. In particularly preferred embodiments, phosphorylated PCR primers are used. In other preferred embodiments, Taq polymerase is used, and the material is preferably treated briefly with T4 polymerase and dNTPs to remove the 3′ overhangs generated.
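The orientation test described above can be illustrated with a toy model of blunt-end ligation on a single strand. Several things here are assumptions: the 20 bp sequence is a widely used symmetric lac operator with an Eco47III site (AGCGCT) at its center and may not be the exact site present on pUNI-30; the vector flanks and the gene are artificial placeholders; and strand conventions are simplified so that the fragment simply carries one half site at its 3′ end and the vector carries the other half at the cut.

```python
# Toy model: an intact lacO is rebuilt only when the fragment is ligated
# in the intended orientation. All sequences except the operator are
# placeholders, and the operator itself is an assumed sequence.
LACO = "AATTGTGAGCGCTCACAATT"      # assumed symmetric lacO, AGCGCT at center
FRAGMENT_HALF, VECTOR_HALF = LACO[:10], LACO[10:]

VECTOR_LEFT = "CCGGTTAGC"          # placeholder flank ending at the blunt cut
VECTOR_RIGHT = VECTOR_HALF + "GGCC"
GENE = "ATGGCTGCTGCTGCTTAA"        # placeholder for the amplified gene

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def ligate(insert=""):
    """Return the top strand of the vector closed around an insert."""
    return VECTOR_LEFT + insert + VECTOR_RIGHT


def has_intact_laco(plasmid):
    # The operator is palindromic, so scanning one strand is sufficient.
    return LACO in plasmid


pcr_product = GENE + FRAGMENT_HALF
print("correct orientation:", has_intact_laco(ligate(pcr_product)))
print("reverse orientation:", has_intact_laco(ligate(reverse_complement(pcr_product))))
print("vector reclosure:   ", has_intact_laco(ligate()))
```

Only the first case rebuilds the full operator in this model, which corresponds to the dark blue colony class; the other two cases correspond to the white or pale blue classes.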

[0155] i) Library Transfer Using UPS

[0156] In addition to permitting the rapid transfer of a gene of interest from a particular pUNI vector containing a gene of interest into a pHOST vector, the Univector Fusion System permits the rapid exchange of an entire cDNA library to a variety of expression vectors. This capability to essentially transform one library into many libraries is one of the most significant advances made possible by the UPS methods provided by the present invention. The high efficiency of the in vitro UPS reaction (i.e., a minimum of 16.8%) coupled with the extremely high efficiency of modern transformation methods makes possible the conversion of whole cDNA libraries constructed in the Univector into expression libraries without loss of representation. Thus, it is contemplated that single cDNA libraries will be converted into any of a number of different expression libraries such as those used in the two hybrid systems [Durfee et al. (1993) Genes & Dev. 7:55; and Aronheim et al. (1997) Mol. Cell. Biol. 17:3094], for complementation cloning in yeast [Elledge et al. (1991) Proc. Natl. Acad. Sci. 88:1731], mammalian expression systems [Okayama and Berg (1982) Mol. Cell. Biol. 2:161], etc. Thus, the present invention provides methods such that libraries made for one purpose will no longer need to be remade from scratch when needed in a different context; clones isolated from these libraries are easily converted back into simple Univector plasmids compatible with other pHOST vectors for future analysis.

[0157] In these methods, the cDNA library is generated using a pUNI vector as the cloning vector (a pUNI library). The entire library may then be transferred (using either an in vitro or an in vivo recombination reaction) into any expression vector modified to contain a sequence-specific recombinase target site (e.g., a lox site) (i.e., into a pHOST vector). This solves an existing problem in the art, in that there is no way, using existing vector systems, to exchange the inserts in a library made in one expression vector en masse (i.e., as an entire library) to a different expression vector. Example 10 provides an illustration of such capabilities using methods of the present invention.

[0158] In addition, the sequences contained within a pUNI library can be used to recombine with linear λ constructs (which can then be used to isolate specific genes by complementation of appropriate host cells such as E. coli or S. cerevisiae mutant cells). For example, UPS is compatible with the λ YES series of lambda cloning vectors that use cre-lox recombination to convert phage clones into plasmids. These vectors are capable of making extremely large cDNA libraries (i.e., greater than 10⁸ recombinants per 100 ng of cDNA) and, unlike plasmid libraries, can be propagated with minimal loss of representation. Further, as described in Example 7, the in vivo gene trap method, a variation of the Univector Fusion System, can be used to transfer linear DNA fragments that lack a selectable marker, such as a PCR product, into a variety of expression vectors.

[0159] An extremely important application of the UPS method is in the manipulation of whole genome sets of coding regions. For organisms whose genomes have been sequenced, a complete set of identified ORFs, or “Unigene” set, can be constructed in the Univector and be systematically converted by UPS into any kind of expression library. Also, the simplicity and uniformity of the UPS reaction makes it readily amenable to automation for systematic conversion of arrayed clones. This greatly expedites the functional characterization of whole genomes and helps further the progression of genome projects into proteome projects.

Experimental

[0160] The following examples serve to illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.

[0161] In the experimental disclosure which follows, the following abbreviations apply: ° C. (degrees Centigrade); g (gravitational field); vol (volume); DNA (deoxyribonucleic acid); RNA (ribonucleic acid); kdal or kD (kilodaltons); OD (optical density); EDTA (ethylene diamine tetra-acetic acid); E. coli (Escherichia coli); SDS (sodium dodecyl sulfate); PAGE (polyacrylamide gel electrophoresis); ts (temperature sensitive); p (plasmid); LB (Luria-Bertani medium: per liter: 10 g Bacto-tryptone, 5 g yeast extract, 10 g NaCl, pH to 7.5 with NaOH); ml (milliliter); μl (microliter); M (Molar); mM (millimolar); μM (microMolar); g (gram); μg (microgram); ng (nanogram); U (units); mU (milliunits); min. (minutes); sec. (seconds); % (percent); bp (base pair); kb (kilobase); PCR (polymerase chain reaction); Tris (tris(hydroxymethyl)-aminomethane); PMSF (phenylmethylsulfonylfluoride); BSA (bovine serum albumin); IPTG (isopropyl-β-D-thiogalactoside); ORF (open reading frame); ATCC (American Type Culture Collection, Rockville, Md.); Bio-Rad (Bio-Rad Corp., Hercules, Calif.); Invitrogen (Invitrogen Corp., San Diego, Calif.); New England Nuclear/Du Pont (Boston, Mass.); Novagen (Novagen, Inc., Madison, Wis.); Pharmacia or Pharmacia Biotech (Pharmacia Biotech, Piscataway, N.J.); Pharmingen (PharMingen, San Diego, Calif.); Gibco BRL (Gaithersburg, Md.); and Stratagene (Stratagene Cloning Systems, La Jolla, Calif.).

EXAMPLE 1 Construction of Univector Constructs

[0162] In this example, illustrative Univector constructs are provided. Maps of several Univectors (pUNI-10, pUNI-20, and pUNI-30) are shown in FIG. 23. In this figure, nucleotide positions (in parentheses) of unique restriction enzyme cleavage sites are shown. Functional sequences are shown as filled boxes and are labeled inside of the circle. Boxes with arrows are genes transcribed in the direction of the arrow. Below each map is the sequence of the polylinker region displayed as coding triplets in frame with the open reading frame of loxP. Unique restriction enzyme cleavage sites are in bold. General features of these Univectors include a loxP site placed adjacent to the 5′ end of a polylinker for insertion of cDNAs. loxP has a single open reading frame that is in frame with the ATG of the NdeI and NcoI sites of the polylinker. This facilitates the subsequent generation of protein fusions as noted below. Following the polylinker are bacterial and eukaryotic transcriptional terminators to facilitate 3′ end formation of transcripts. The Univectors also comprise a conditional origin of replication derived from R6Kγ that allows their propagation only in bacterial hosts expressing the pir gene originally from R6Kγ [Metcalf et al. (1994) Gene 138:1]. The Univectors also have the neo gene from Tn5 for selection in bacteria (e.g., selection of recombinant products of UPS is achieved by selecting for kanamycin resistance after transformation into a pir⁻ strain because the neo gene on the pUNI can only be propagated when covalently linked to an origin of replication that is functional in a pir⁻ background). pUNI-20 contains additional site specific recombination sites, such as RS, that facilitate precise ORF transfer (POT), as described below.

[0163] One Univector construct, the pUNI-10 vector, contains a loxP site, a kanamycin resistance gene (Kn^(R)) and the R6Kγ conditional origin of replication (OriR_(R6Kγ)). The OriR_(R6Kγ) is functional only in E. coli strains expressing the Π replication protein (i.e., the product of the pir gene). A gene of interest is placed within pUNI-10 (either as a result of constructing a library in pUNI-10 or by subcloning a previously cloned gene of interest). Once the gene of interest is contained within pUNI-10, any number of plasmid expression constructs containing this gene of interest can be constructed rapidly (e.g., within a single day). The expression constructs will contain an antibiotic resistance gene other than kanamycin (e.g., ampicillin). Using the site-specific recombinase, Cre, a precise fusion between the pUNI vector and any other loxP site-containing vector comprising the desired expression signals adjacent to the loxP site is catalyzed. The site-specific recombination event which occurs between the single loxP sites located on each plasmid (e.g., pUNI and the expression vector) results in the stable fusion of these two plasmids in such a manner as to place the expression of the gene of interest under the control of the expression signals contained within the expression vector. This subcloning event occurs without the need to use restriction enzymes. The fusion of pUNI-10 and the expression vector is selected for by selecting for the ability of E. coli cells that do not express the Π protein to grow in the presence of kanamycin. pUNI cannot replicate in E. coli cells that do not express the Π protein unless pUNI has fused or integrated into another plasmid that contains a normal (i.e., not a conditional) origin of replication (e.g., the Col E1 origin). In this case, pUNI will be replicated (as part of the fusion plasmid) and kanamycin resistance will be conferred on the host cell.
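The selection logic described above can be summarized as a small truth-table model. This is a simplified sketch: plasmids are reduced to a set of origins and a set of resistance markers, the class and function names are hypothetical, and the expression vector is assumed to carry a Col E1 origin and an ampicillin resistance marker.

```python
# Simplified model of the selection for fused plasmids. Names are
# hypothetical; markers and origins follow the description in the text.
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    origins: frozenset   # e.g. {"R6Kgamma"} or {"ColE1"}
    markers: frozenset   # e.g. {"kan"} or {"amp"}


def fuse(a, b):
    """Model a Cre-mediated fusion: the cointegrate carries everything."""
    return Plasmid(f"{a.name}::{b.name}", a.origins | b.origins, a.markers | b.markers)


def replicates(plasmid, host_has_pir):
    """The R6Kgamma origin needs the pir product; Col E1 does not."""
    return "ColE1" in plasmid.origins or ("R6Kgamma" in plasmid.origins and host_has_pir)


def colony_grows(plasmids, marker, host_has_pir):
    """A colony forms if some replicating plasmid carries the marker."""
    return any(replicates(p, host_has_pir) and marker in p.markers for p in plasmids)


puni = Plasmid("pUNI-10", frozenset({"R6Kgamma"}), frozenset({"kan"}))
phost = Plasmid("pHOST", frozenset({"ColE1"}), frozenset({"amp"}))

print(colony_grows([puni, phost], "kan", host_has_pir=False))       # unfused pair
print(colony_grows([fuse(puni, phost)], "kan", host_has_pir=False)) # fusion product
```

In a host lacking the pir product, only the fused plasmid supports growth on kanamycin in this model, which is the basis of the selection.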

[0164] a) Generation of pUNI-10

[0165] FIG. 2A provides a schematic map of the pUNI-10 vector; the locations of selected restriction enzyme sites are indicated (with the exception of NotI, all sites shown are unique). FIG. 2B shows the DNA sequence of the loxP site and the polylinkers contained within pUNI-10 (i.e., nucleotides 401-530 of SEQ ID NO:1).

[0166] Nucleotides 1-400 of pUNI-10 contain the conditional origin of replication from R6Kγ (OriR_(R6Kγ)); the OriR_(R6Kγ) was derived from the plasmid R6K (ATCC 37120) [Metcalf et al. (1996) Plasmid 35:1]; nucleotides 401-414 comprise a NotI-KpnI polylinker that facilitates the exchange of lox sites; pUNI-10 contains a wild-type loxP site (as discussed above, pUNI vectors containing modified lox sites may be employed). Nucleotides 415-448 comprise the wild-type loxP site; nucleotides 449-527 comprise a polylinker used for the insertion of the gene of interest (genomic or cDNA sequences). Nucleotides 528-750 contain the polyA addition sequence from bovine growth hormone (BGH) (the BGH polyA sequence is available on a number of commercially available vectors including pcDNA3.1 (Invitrogen)); the BGH polyA sequence provides a 3′ end for transcripts expressed in mammalian and other eukaryotic cells. The art is aware of other eukaryotic polyA sequences that may be used in place of the BGH polyA sequence (e.g., the SV40 polyA sequence, the TK polyA sequence, etc.). Nucleotides 751-890 contain the T7 terminator sequence which is used to terminate transcription in prokaryotic hosts (numerous prokaryotic termination signals are known to the art and may be employed in place of the T7 terminator sequence). Nucleotides 890-895 comprise an EcoRV restriction enzyme recognition site and nucleotides 896-2220 comprise the kanamycin resistance gene (Kan or Kn^(R)) from Tn5 which provides a positive selectable marker. The Kn^(R) gene found on pUNI-10 was modified using site-directed mutagenesis to remove the naturally occurring NcoI site such that pUNI-10 contains a unique NcoI site in the polylinker region located at nucleotides 449-527. pUNI vectors need not contain a Kn^(R) gene (modified or wild-type); other selectable genes may be used in place of the Kn^(R) gene (e.g., ampicillin resistance gene, tetracycline resistance gene, zeocin™ resistance gene, etc.). The pUNI vector need not contain a selectable marker, although the use of a selectable marker is preferred. When a selectable marker is present on the pUNI vector, this marker is preferably a different selectable marker than that present on the pHOST vector. The nucleotide sequence of pUNI-10 is provided in SEQ ID NO:1.
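The coordinate ranges listed for pUNI-10 can be collected into a simple feature table and checked for internal consistency. The sketch below uses only the coordinates stated in the text; the feature labels and helper names are illustrative. It reports segment lengths, gaps and overlaps between neighboring features, and it flags that the stated T7 terminator and EcoRV ranges share position 890 without deciding which boundary is intended.

```python
# Illustrative consistency check on the pUNI-10 coordinates stated above.
from collections import namedtuple

Feature = namedtuple("Feature", "name start end")  # 1-based, inclusive

FEATURES = [
    Feature("R6Kgamma origin", 1, 400),
    Feature("NotI-KpnI polylinker", 401, 414),
    Feature("loxP", 415, 448),
    Feature("cloning polylinker", 449, 527),
    Feature("BGH polyA", 528, 750),
    Feature("T7 terminator", 751, 890),
    Feature("EcoRV site", 890, 895),
    Feature("Kn^R gene", 896, 2220),
]


def length(feature):
    return feature.end - feature.start + 1


def find_overlaps(features):
    """Return (first, second, start, end) for overlapping neighbors."""
    ordered = sorted(features, key=lambda f: f.start)
    return [
        (a.name, b.name, b.start, min(a.end, b.end))
        for a, b in zip(ordered, ordered[1:])
        if b.start <= a.end
    ]


def find_gaps(features):
    """Return (first, second, start, end) for uncovered stretches."""
    ordered = sorted(features, key=lambda f: f.start)
    return [
        (a.name, b.name, a.end + 1, b.start - 1)
        for a, b in zip(ordered, ordered[1:])
        if b.start > a.end + 1
    ]


print(length(FEATURES[2]))       # loxP length in nt
print(find_overlaps(FEATURES))
print(find_gaps(FEATURES))
```

The stated loxP range spans 34 nt and the EcoRV range spans 6 nt, consistent with the sizes of those elements.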

EXAMPLE 2 Construction of Host Plasmids for Use in the UnivectorPlasmid-Fusion System

[0167] Host plasmids used in the Univector plasmid fusion system arereferred to as pHOST plasmids. pHOST plasmids or vectors are generallyexpression vectors that have been modified by the insertion of asite-specific recombination site, such as a lox site. The presence ofthe lox site on the pHOST plasmid permits the rapid subcloning orinsertion of the gene interest contained within a pUNI vector togenerate an expression vector capable of expressing the gene ofinterest. The pHOST vector may encode a protein domain such as anaffinity domain including, but not limited to, glutathione-S-transferase(Gst), maltose binding protein (MBP), a portion of staphylococcalprotein A (SPA), a polyhistidine tract, etc. A variety of commerciallyavailable expression vectors encoding such affinity domains are known tothe art. When the pHOST plasmid contains a vector-encoded affinitydomain, a fusion protein comprising the vector-encoded affinity domainand the protein of interest is generated when the pUNI and pHOST vectorsare recombined.

[0168] In some embodiments of the present invention, the host vectorfeatures include the Col E1 origin of replication and the bla gene forpropagation and selection in bacteria, a loxP site for plasmid fusionsand a specific promoter residing upstream of, and adjacent to, the loxPsite. Host vectors may also comprise sequences responsible forpropagation, selection, and maintenance in organisms other than E. coli.

[0169] To generate expression vectors intended to generatetranscriptional fusions (i.e., pHOST does not contain a vector-encodedprotein domain), a lox site is placed after (i.e., downstream of) thestart of transcription in the host vector. This is easily accomplishedusing synthetic oligonucleotides comprising the desired lox site. Indesigning the oligonucleotide comprising the lox site, care is taken toavoid introducing an ATG or start codon that might initiate translationinappropriately.

[0170] To generate expression vectors intended to generate a fusionprotein between a vector-encoded protein domain and the protein ofinterest (encoded by the gene of interest contained within the pUNIvector), care is taken to place the lox site in the correct readingframe such that 1) an open reading frame is maintained through the loxsite on pHOST and 2) the open reading frame in the lox site on pHOST isin frame with the open reading frame found on the lox site containedwithin the pUNI vector. In addition, the oligonucleotide comprising thelox site on pHOST is designed to avoid the introduction of in-frame stopcodons. The gene of interest contained within the pUNI vector is clonedin a particular reading frame so as to facilitate the creation of thedesired fusion protein.

[0171] The modification of several expression vectors is provided below to illustrate the creation of suitable pHOST vectors. In each case, the general strategy involved the generation of a linker containing a lox site by annealing two complementary oligonucleotides. The annealed oligonucleotides form a linker having sticky ends that are compatible with ends generated by restriction enzymes whose sites are conveniently located in the parental expression vector (e.g., within the polylinker of the parental expression vector).

[0172] a) Modification of the pGEX-2TKcs Prokaryotic Expression Vector

[0173] pGEX-2TKcs is an expression vector active in E. coli cells which is designed for inducible, intracellular expression of genes or gene fragments as fusions with Gst. pGEX-2TKcs contains the IPTG-inducible tac promoter (P_(tac)) and was derived from pGEX-2TK (Pharmacia Biotech) as follows. The polylinker sequence of pGEX-2TK, 5′-GGATCCCCGGGAATTC-3′ (SEQ ID NO:2), was replaced with the following sequence: 5′-GGATCGCATATGCCCATGGCTCGAGGATCCGAATTC-3′ (SEQ ID NO:3) to generate the pGEX-2TKcs vector.

[0174] A linker containing a loxP site was generated by annealing the following oligonucleotides: 5′-CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG-3′ (SEQ ID NO:4) and 5′-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC-3′ (SEQ ID NO:5). When annealed, these two oligonucleotides form a double-stranded linker having a 5′ end compatible with an NcoI sticky end and a 3′ end compatible with a BamHI sticky end (FIG. 3A). pGEX-2TKcs was digested with NcoI and BamHI (FIG. 3B) and the annealed loxP linker was inserted to form pGst-lox.
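As an illustrative check on the linker design, the following Python sketch confirms that the two oligonucleotides of SEQ ID NOS:4 and 5 are complementary over their paired region and reports the single-stranded ends left after annealing. The function names are hypothetical, and the sketch assumes that each oligonucleotide contributes a 4-nucleotide 5′ overhang, as is the case for NcoI (CATG) and BamHI (GATC) ends.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def linker_overhangs(top: str, bottom: str, overhang: int = 4) -> tuple:
    """Check that two oligos anneal and return their 5' overhangs.

    Assumes both oligos are written 5' to 3' and that each leaves an
    unpaired 5' extension of `overhang` nucleotides once annealed.
    """
    top, bottom = top.upper(), bottom.upper()
    if top[overhang:] != reverse_complement(bottom[overhang:]):
        raise ValueError("oligonucleotides are not complementary in the paired region")
    return top[:overhang], bottom[:overhang]


SEQ_ID_4 = "CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG"
SEQ_ID_5 = "GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC"
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

ends = linker_overhangs(SEQ_ID_4, SEQ_ID_5)
print(ends, LOXP in SEQ_ID_4)
```

The same check can be applied to the other linker pairs by passing their sequences in place of SEQ ID NOS:4 and 5.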

[0175] b) Modification of the pVL1392 Baculovirus Expression Vector

[0176] pVL1392 is an expression vector that contains the polyhedrin promoter which is active in insect cells (Pharmingen). A linker containing a loxP site was generated by annealing the following oligonucleotides: 5′-GGCCGGACGTCATAACTTCGTATAGCATACATTATACGAAGTTATG-3′ (SEQ ID NO:6) and 5′-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATGACGTCC-3′ (SEQ ID NO:7). When annealed, these two oligonucleotides form a double-stranded linker having a 5′ end compatible with a NotI sticky end and a 3′ end compatible with a BamHI sticky end (FIG. 4A). pVL1392 was digested with NotI and BamHI (FIG. 4B) and the annealed loxP linker was inserted to form pVL1392-lox.

[0177] c) Modification of the pGAP24 Yeast Expression Vector

[0178] pGAP24 is an expression vector that is based on the yeast 2 μm circle and contains the constitutive GAP (glyceraldehyde 3-phosphate dehydrogenase) promoter (P_(GAP)) which is active in yeast cells and the TRP1 gene (used as a selectable marker when the cells are grown in medium lacking tryptophan) [the GAP promoter is available on pAB23; Schilds (1990) Proc. Natl. Acad. Sci. USA 87:2916]. A linker containing a loxP site was generated by annealing the following oligonucleotides: 5′-TCGAGACGTCATAACTTCGTATAGCATACATTATACGAAGTTATGC-3′ (SEQ ID NO:8) and 5′-GGCCGCATAACTTCGTATAATGTATGCTATACGAAGTTATGACGTC-3′ (SEQ ID NO:9). When annealed, these two oligonucleotides form a double-stranded linker having a 5′ end compatible with a XhoI sticky end and a 3′ end compatible with a NotI sticky end (FIG. 5A). pGAP24 was digested with XhoI and NotI (FIG. 5B) and the annealed loxP linker was inserted to form pGAP24-lox.

[0179] d) Modification of the pGAL14 Yeast Expression Vector

[0180] pGAL14 is a yeast centromeric expression vector that contains the GAL promoter (P_(GAL)), which is induced by the presence of galactose in the medium, and the TRP1 gene. A linker containing a loxP site was generated by annealing together the oligonucleotides listed in SEQ ID NOS:8 and 9. When annealed, these two oligonucleotides form a double-stranded linker having a 5′ end compatible with a XhoI sticky end and a 3′ end compatible with a NotI sticky end (FIG. 6A). pGAL14 was digested with XhoI and NotI (FIG. 6B) and the annealed loxP linker was inserted to form pGAL14-lox.

EXAMPLE 3 Expression and Purification of a Gst-Cre Fusion Protein

[0181] In order to provide a source of purified Cre recombinase for the in vitro recombination of plasmids, the cre gene was inserted into a Gst expression vector such that a fusion protein comprising Gst at the amino-terminal end and Cre recombinase at the carboxy-terminal end was produced. The Gst-Cre fusion protein was purified by chromatography using Glutathione Sepharose 4B (Pharmacia). Purified Gst-Cre can be stored at −80° C., −20° C., or 4° C. for several months without significant loss of activity.

[0182] To simplify Cre purification, a plasmid expressing a Gst-Cre fusion protein, pQL123, was constructed. The cre gene was isolated by polymerase chain reaction (PCR) amplification using the plasmid pBS39 (U.S. Pat. No. 4,959,317). U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,965,188 describe PCR methodology and are incorporated herein by reference. The primers used in the PCR were designed to introduce an NcoI site at the first ATG in the cre open reading frame. The PCR product was cloned into a TA cloning vector (pCRII.1; Invitrogen) and then was subcloned as an NcoI-EcoRI fragment into pGEX-2TKcs (Example 2) to generate pQL123. The ligation products were used to transform DH5α cells and the desired recombinant was isolated and used to transform BL21(DE3) cells (Invitrogen).

[0183] The nucleotide sequence of the Gst-Cre coding region within pQL123 is listed in SEQ ID NO:10 (FIG. 26B). The amino acid sequence of the fusion protein expressed by pQL123 is listed in SEQ ID NO:11 (FIG. 26C).

[0184] To express the Gst-Cre fusion protein, BL21(DE3) cells containing the pQL123 plasmid were grown at 37° C. in LB containing 100 μg/ml ampicillin until the OD₆₀₀ reached 0.6. Expression of the fusion protein was then induced by the addition of IPTG to a final concentration of 0.4 mM and the cells were allowed to grow overnight at 25° C. Following induction, the bacterial cells were pelleted by centrifugation at 5,000×g at 4° C. and the supernatant was discarded. A cell lysate was prepared as follows. Cells harvested from 0.5 liter of culture were suspended in 35 ml of a solution containing 20 mM Tris-HCl, pH 8.0, 0.1 M NaCl, 1 mM EDTA, 0.5% Nonidet P-40, 5 μg/ml of each of leupeptin, antipain, aprotinin and 1 mM PMSF at 4° C. The cells were incubated for 10 min on ice and then disrupted by sonication (3×15 sec bursts) using a sonicator (Ultrasonic Heat Systems Model 200R) at full power. The lysate was then clarified by centrifugation at 12,000 rpm using a SS34 rotor (Sorvall).

[0185] The Gst-Cre fusion protein was affinity purified from the cell lysate by chromatography on Glutathione Sepharose 4B (Pharmacia) according to the manufacturer's instructions. The protein concentration of Gst-Cre was determined by Bradford analysis (BioRad).

[0186] Aliquots of the cell lysate before and after chromatography on Glutathione Sepharose 4B were applied to an SDS-PAGE gel. Following electrophoresis, the gel was stained with Coomassie blue. The stained gel is shown in FIG. 7. In FIG. 7, lanes 1 and 2 contain the cell lysate before and after chromatography, respectively. The arrowhead indicates the Gst-Cre fusion protein. The migration of the molecular weight protein markers is indicated to the left of lane 1. The results shown in FIG. 7 demonstrate the purification of the Gst-Cre fusion protein. This fusion protein was shown to be functional (i.e., capable of mediating recombination between lox sites) in the in vitro recombination assay described below.

[0187] Gst-Cre retained high recombinase activity as measured by UPS. The efficiency of this reaction reached up to 16.8% as shown in FIG. 15, similar to that for native Cre (Abremski et al., supra). In this figure, the indicated amounts of Gst-Cre were incubated with pUNI-10 and pQL103 plasmid DNA as described below. The percentage of recombinants was calculated as the ratio of total kanamycin resistant transformants (fusion events between pUNI-10 and pQL103) to total ampicillin resistant transformants (pQL103 alone and pUNI-10-pQL103 fusions). The efficiency of Gst-Cre was examined in a second reaction producing a tagged recombinant protein as diagrammed in FIG. 24, fusing a Gst tag to Skp1. Recombinant plasmids isolated from Kn^(r) transformants were shown by restriction analysis to be correct fusion products between the Univector and the host vector via the loxP sites. In this case, 10 of 12 Kn^(r) transformants were the correct heterodimer (FIG. 9) and 2 were trimers (FIG. 9, lanes 8 and 10) with two copies of pUNI fused to a host vector. It should be noted that trimeric plasmids also have a correct fusion junction that places the gene of interest adjacent to the desired regulatory sequences and are fully functional for most needs. However, the isolation of trimeric plasmids can be nearly eliminated if gel purified monomeric supercoiled host DNA is used. This method is highly efficient and typically requires only one or two minipreps to identify the desired construct.

EXAMPLE 4 In Vitro Recombination Using the Univector Plasmid Fusion System

[0188] The Univector Plasmid Fusion System permits the in vitro recombination of two plasmids. FIG. 8 provides a schematic showing the strategy employed for in vitro recombination. pA represents a generic pUNI vector that contains a loxP site, a kanamycin resistance gene and the conditional R6K origin that is only functional in E. coli strains expressing the π protein (e.g., E. coli strains BW18815, BW19094, BW20978, BW20979, BW21037, BW21038). pB represents a generic pHOST vector that contains a loxP site, an ampicillin resistance gene and a Col E1 origin of replication. pAB represents the fused plasmid which results from the Cre-mediated fusion of pA and pB.

[0189] To illustrate the in vitro recombination reaction, pUNI-5 (a pUNI vector which differs from pUNI-10 only in that pUNI-5 retains the NcoI site in the Kn^(R) gene and contains a different polylinker) was employed as pA and pQL103, an ampicillin-resistant plasmid containing a loxP site and the ColE1 origin, was employed as pB. In a total reaction volume of 20 μl, 0.2 μg each of pUNI-5 (pA) and pQL103 (pB) were mixed in a buffer containing 50 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 30 mM NaCl and 1 mg/ml BSA. The amount of purified Gst-Cre (Example 3) was varied from 0 to 1.0 μg. The reactions were incubated at 37° C. for 20 minutes and were then placed at 70° C. for 5 min. to inactivate the Gst-Cre protein. Five microliters of each reaction mixture were used directly to transform competent DH5α cells (CaCl₂ treated). The transformed cells were plated onto LB/Amp (100 μg/ml amp) and LB/Kan (40 μg/ml kan) plates and the numbers of ampicillin-resistant (Ap^(R)) and kanamycin-resistant (Kn^(R)) colonies were counted. The results are summarized in Table 1.

TABLE 1

Gst-Cre (μg/reaction)    Ap^(R) Colonies    Kn^(R) Colonies    % of Total Kn^(R)/Ap^(R)
0                        2.6 × 10⁴          0                  0
0.01                     1.9 × 10⁴          571                3
0.05                     1.1 × 10⁴          682                6.2
0.1                      1.5 × 10⁴          502                3.3
0.5                      0.3 × 10⁴          104                3.4
1.0                      0.3 × 10⁴          52                 1.7

[0190] The results shown in Table 1 demonstrate that, under these reaction conditions, 0.05 μg purified Gst-Cre per 20 μl reaction yields the most efficient rate of plasmid fusion. Plasmid DNA was isolated from individual kanamycin-resistant colonies (using standard mini-prep plasmid DNA isolation protocols) and subjected to restriction enzyme digestion to determine the structure of the fused plasmids. This analysis revealed that plasmid DNA isolated from the kanamycin-resistant colonies represented a dimer created by the desired fusion of pUNI-5 and pQL103 via the loxP sites. These results demonstrate that the Univector Plasmid Fusion System can be used to rapidly fuse two plasmids together in vitro.
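The percentages in Table 1 can be recomputed from the colony counts with a short Python sketch. The colony numbers are taken directly from Table 1; the variable and function names are illustrative only. The calculation is the ratio of Kn^(R) to Ap^(R) colonies, expressed as a percentage, for each amount of Gst-Cre.

```python
# (Gst-Cre in ug per reaction, Ap^R colonies, Kn^R colonies), from Table 1
TABLE_1 = [
    (0.0, 2.6e4, 0),
    (0.01, 1.9e4, 571),
    (0.05, 1.1e4, 682),
    (0.1, 1.5e4, 502),
    (0.5, 0.3e4, 104),
    (1.0, 0.3e4, 52),
]


def percent_recombinants(ap_colonies: float, kn_colonies: float) -> float:
    """Kn^R colonies as a percentage of Ap^R colonies."""
    return 100.0 * kn_colonies / ap_colonies


rows = [(dose, percent_recombinants(ap, kn)) for dose, ap, kn in TABLE_1]
best_dose, best_percent = max(rows, key=lambda row: row[1])

for dose, percent in rows:
    print(f"{dose:>5} ug Gst-Cre: {percent:.1f}% Kn^R/Ap^R")
print(f"most efficient amount: {best_dose} ug")
```

The recomputed values agree with the tabulated ones to within rounding, and the maximum falls at 0.05 μg Gst-Cre per reaction.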

EXAMPLE 5 In Vitro Fusion Between pUNI Vectors Containing Genes of Interest and Lox-Containing Expression Vectors Produces Fused Vectors Capable of Expressing the Gene of Interest

[0191] In Example 4 it was demonstrated that the Univector Plasmid Fusion System can be used to rapidly fuse two plasmid constructs together in vitro. In this example, the ability of the Univector Plasmid Fusion System to fuse two plasmids together in a manner that places the gene of interest contained on the pUNI vector under the transcriptional control of a promoter contained on the pHOST or expression vector, such that a functional protein of interest is expressed from the fused construct, is demonstrated. A series of expression plasmids was made by UPS and tested for expression in several contexts.

[0192] a) Insertion of a Gene of Interest Into the pUNI-10 Vector

[0193] The cDNA encoding the wild-type yeast Skp1 protein [Bai et al. (1996) Cell 86:263] was cloned into the pUNI-10 vector between the NdeI and BamHI sites to generate pUNI-Skp1; the yeast SKP1 cDNA sequence is available as GenBank Accession No. U61764. Skp1 is an essential protein involved in the regulation of the cell cycle in yeast. Yeast cells containing a temperature sensitive mutant of Skp1 cannot grow at the non-permissive temperature (37° C.).

[0194] b) In Vitro Fusion Reactions and Complementation Assays

[0195] pUNI-Skp1 was recombined with pGAP24-lox (Example 2) and pGAL14-lox (Example 2) using the in vitro reaction described in Example 4; 0.2 μg of Gst-Cre was used per 20 μl reaction. The resulting plasmid fusions were termed pGAP24-Skp1 and pGAL14-Skp1. pGAP24-Skp1 and pGAL14-Skp1 were then transformed into the temperature sensitive (ts) skp1-11 mutant yeast strain Y555 (Bai et al., supra) and the transformed yeast cells were plated onto SC-tryptophan plates (to select for the expression of the selectable marker TRP1) and incubated at either a permissive (25° C.) or non-permissive temperature (37° C.). The plates which received yeast cells transformed with pGAL14-Skp1 contained galactose. The ability of the transformed cells to grow at the non-permissive temperature is dependent upon the expression of the wild-type skp1 gene encoded by a properly fused pUNI-Skp1/expression vector construct. As a control, the yeast SKP1 genomic clone contained in a URA3 CEN vector (produced by conventional cloning techniques) was used to transform the ts skp1-11 mutant yeast strain Y555 and the transformed cells were also plated at 25° C. and 37° C. In each case, an expression vector (e.g., pRS414 or pRS415; Bai et al., supra) lacking the SKP1 gene but containing the same selectable marker (i.e., TRP1) as either pGAP24-Skp1, pGAL14-Skp1 or URA3 CEN-Skp1 was used to transform Y555 cells as a control capable of permitting the growth of transformed Y555 cells on selective medium at the permissive temperature.

[0196] The results demonstrated that the URA3 CEN-SKP1 construct produced by conventional cloning techniques produced a functional Skp1 protein which was capable of complementing the lethality of the skp1-11 ts mutation. More importantly, the results demonstrated that the in vitro fusion reaction that created pGAP24-Skp1 and pGAL14-Skp1 produced constructs capable of producing functional Skp1; that is, Y555 cells transformed with either pGAP24-Skp1 or pGAL14-Skp1 were capable of growth at 37° C., a temperature at which the ts Skp1-11 protein produced by the host strain is non-functional. Expression vectors lacking the SKP1 cDNA were incapable of complementing the lethality of the skp1-11 ts mutation.

[0197] c) Restriction Analysis, SDS-PAGE Analysis and Western Blot Analysis of in Vitro Fusion Reactions

[0198] pUNI-Skp1 was recombined with pGst-lox (Example 2) using the in vitro reaction described in Example 4; 0.2 μg of Gst-Cre was used per 20 μl reaction. The resulting plasmid fusion was termed pGST-Skp1. FIG. 9A provides a schematic showing the starting constructs and the predicted fusion construct. Five microliters of the fusion reaction mixture were used to transform DH5α cells as described in Example 4. The transformed cells were plated onto LB/Amp/Kan plates and plasmid DNA was isolated from individual Ap^(R)Kn^(R) colonies. The plasmid DNAs were digested with PstI followed by electrophoresis on agarose gels to examine the structure of the fused plasmids. A representative ethidium bromide-stained gel is shown in FIG. 9B. In FIG. 9B, lane “M” contains DNA size markers, lanes pUNI-Skp1 and pGst-lox contain the starting plasmids digested with PstI and lanes 1-12 contain plasmid DNA from individual Ap^(R)Kn^(R) colonies digested with PstI. Lanes marked with an “*” indicate that these colonies contained a trimeric fusion plasmid that resulted from the fusion of two Gst-lox plasmids and one pUNI-Skp1 plasmid. The sizes of the two PstI fragments which result from the fusion of pUNI-Skp1 and pGst-lox in kb are indicated (5.8 and 2.0 kb). The results shown in FIG. 9B demonstrate that the in vitro fusion reaction resulted in the production of the desired fused construct with high efficiency (about 83% of the plasmids in the Ap^(R)Kn^(R) colonies comprised the fusion of one pUNI-Skp1 vector with one pGst-lox vector).

[0199] Three individual Ap^(R)Kn^(R) colonies were picked and grown in liquid cultures which were induced with IPTG to examine whether the fused construct (pGst-Skp1) could produce the desired Gst-Skp1 fusion protein. The cultures were grown and induced, and cell extracts were prepared, as described in Example 6. Aliquots of the cell lysates prepared from induced and uninduced cells were electrophoresed on an SDS-PAGE gel and the gel was either stained with Coomassie blue or transferred to nitrocellulose to generate a Western blot. The Western blot was probed using an anti-Skp1 polyclonal antibody (the antibody was raised against the yeast Skp1 using conventional methods). The resulting Coomassie-stained gel and Western blot are shown in FIGS. 10A and 10B, respectively.

[0200] In FIG. 10A, lane “M” contains protein molecular weight markers (size in kd is indicated). Lanes marked “C” contain extracts prepared from E. coli containing a GST-SKP1 construct made by conventional cloning (i.e., the SKP1 cDNA was excised using restriction enzymes and inserted into pGEX-2TKcs (Example 2)). Lanes 1-3 contain extracts from Ap^(R)Kn^(R) cells transformed with in vitro fusion reaction mixtures. Extracts prepared from uninduced cells and IPTG induced cells are indicated by “−” and “+”, respectively. The arrowheads indicate the location of the Gst-Skp1 fusion proteins. The Gst-Skp1 fusion product generated from the pGST-SKP1 fusion construct contains 15 additional amino acids which are located between the Gst domain and the Skp1 protein sequences relative to the Gst-Skp1 fusion protein expressed from the conventionally constructed GST-SKP1 plasmid (the additional 15 amino acids are encoded by the linker comprising the loxP site; see FIG. 3). In FIG. 10B, the lane designations are the same as described for FIG. 10A. This Western blot confirms that the bands indicated by the arrowheads in FIG. 10A represent Gst-Skp1 fusion proteins.

[0201] The results shown in FIGS. 10A and 10B demonstrate that the Univector Fusion System can be used to create an expression vector that maintains the proper translational reading frame and permits the expression of a fusion protein comprising the expression vector-encoded affinity tag and the protein of interest.

[0202] The above results demonstrate that the Univector Fusion System can be used to recombine two plasmids, one containing a gene of interest but no promoter (this vector may optionally contain expression signals such as termination signals and/or polyadenylation signals) and the other containing a promoter and optionally other expression signals (e.g., splicing signals, translation initiation codons) (and optionally sequences encoding an affinity domain) but lacking a gene of interest, in vitro in such a manner that the proper translational reading frame is maintained permitting the expression of a functional protein from the fused plasmids in the host cell.

[0203] d) Additional Examples

[0204] The S. cerevisiae SKP1 ORF (Bai et al., supra) in pUNI-10 was fused to the pGST-lox host vector pHB2-GST by UPS to create a bacterial Gst-lox-Skp1 fusion protein expressed under the control of the E. coli tac promoter. A similar Gst-Skp1 expression plasmid lacking loxP (i.e., pCB149), made by conventional cloning, was used as a control. Approximately equal amounts of the two fusion proteins were expressed as shown in FIGS. 16A and B, indicating that the presence of loxP did not significantly affect either the transcription or translation of the fusion protein. In this figure, proteins were separated by SDS-PAGE and stained with Coomassie blue (FIG. 16A) or immunoblotted (FIG. 16B) with anti-Skp1 antibodies. Protein from a control GST-Skp1 expression plasmid lacking loxP (lanes 1 and 2) and three independent transformants of UPS-derived Gst-lox-Skp1 expression constructs (lanes 3-8) are shown. The asterisk denotes a degradation product.

[0205] In another example, to measure the effect of the loxP sequence upon eukaryotic expression in the context of transcriptional fusions, the SKP1 ORF was placed under the control of the S. cerevisiae GAL1 promoter both by conventional means and by UPS. In this case, it was observed that the relative expression level of the UPS-derived plasmid was slightly lower. This reduction in expression might be explained by the ability of loxP RNA to form a 13 bp stem-loop, as secondary structures formed within the 5′ UTR of an mRNA can interfere with the initiation of translation [Kozak (1989) Mol. Cell. Biol. 9:5134], although an understanding of the mechanism is not required to practice the present invention, and the present invention is not limited to any particular mechanistic explanation. To test this hypothesis, a series of lox sites were made containing mutations designed to reduce the stability of the stem-loop, as described in Example 8.

[0206] In yet other examples, multiple genes have been tested using UPS and expressed in several different organisms. In addition to Gst-Skp1 expression in bacteria, Myc-Rnr4 and Myc-Rad53 have been expressed in S. cerevisiae as shown in FIG. 17, which compares expression levels between loxP and loxH containing constructs. Protein extracts were prepared from Y80 cells grown in SC-ura plus galactose containing the following plasmids: vector alone (lane 1), pMH176 (GAL-MYC3-RNR4) made by conventional cloning lacking a lox sequence (lane 2), UPS-derived GAL-lox-MYC3-RNR4 constructs with either loxP (lane 3) or loxH (lane 4) present between the GAL1 promoter and the MYC3-RNR4 gene, vector alone (lane 5), and UPS-derived GAL1-MYC3-lox-RAD53 construct (lane 6). The recipient vector for RAD53 was pHY314-MYC3.

[0207] Furthermore, many baculovirus expression constructs have been made by UPS and tested. Shown in FIG. 18, as illustrative examples, are Gst-Rad53, Myc-Rad53, and HA-Rad53. For Rad53, the UPS-derived constructs express at the same level as Gst-Rad53 made by conventional methods (FIG. 18, compare lanes 1 and 2). FIG. 18 shows the expression of the UPS-derived baculovirus expression constructs in insect cells. UPS reactions were performed between pUNI-10-RAD53 clones and baculovirus expression vectors in pVL1392 backbones engineered to contain lox sites and epitope tags. Host insect expression vectors used were pHI100-GST, pHI100-MYC3, and pHI100-HA3 and the resulting fusion plasmids were crossed onto Baculogold (Pharmingen) by standard methods. GST affinity purified protein from lysates from 1 million cells infected with baculovirus expressing GST-RAD53 made either by conventional cloning (lane 1) or by UPS (lane 2) was fractionated by SDS-PAGE and Coomassie stained. Western blots of protein prepared from cells infected with the baculoviruses containing vector alone (lane 3), UPS-derived MYC3-lox-RAD53 (lane 4), vector alone (lane 5), or UPS-derived HA3-lox-RAD53 (lane 6) were probed with anti-Myc (lanes 3-4) or anti-HA (lanes 5-6) monoclonal antibodies.

[0208] In yet other examples, in mammals, the present invention demonstrated expression of a Myc-tagged F-box protein under the control of the CMV promoter when transfected into HeLa cells as shown in FIG. 19. This figure shows immunoblotting of whole cell lysates with anti-HA antibodies. The cells used were HeLa cells transfected by the calcium phosphate method with the CMV expression vectors pHM200-HA3 or pHM200-HA3-F3, expressing an HA-tagged F-box protein. In all, over 200 UPS-derived constructs have been made and tested, showing expression success rates indistinguishable from those of conventional cloning methods.

EXAMPLE 6 Construction of an E. coli Strain that Inducibly Expresses Cre Recombinase

[0209] An E. coli strain containing a cre gene under the control of an inducible promoter, termed the QLB4 strain, was constructed as follows. The cre gene was placed under the transcriptional control of the inducible lac promoter by inserting the cre ORF into a derivative of pNN402 [Elledge et al. (1991) Proc. Natl. Acad. Sci. USA 88:1731]; pNN402 was modified to contain a lac promoter. This construct was then crossed onto lambda phage (e.g., λgt11) using conventional techniques. The recombinant lambda phage carrying the lac-cre gene was integrated into the chromosome of E. coli strain JM107 to generate the QLB4 strain.

[0210] Expression of Cre recombinase was induced by growing QLB4 cells at 37° C. until an OD₆₀₀ of 0.6 was reached. The culture was then split into 2 parts and IPTG was added to one part to a final concentration of 0.4 mM. As a control, the BNN132 strain [ATCC 47059; Elledge et al. (1991), supra], which contains the cre gene under the transcriptional control of the endogenous cre promoter, was treated as described for the QLB4 strain. Cell extracts (total protein) were prepared from all four samples (QLB4±IPTG and BNN132±IPTG) and examined for expression of Cre recombinase by Western blotting analysis. The Western blot was probed using a rabbit polyclonal anti-Cre antibody (Novagen) as the primary antibody and a goat anti-rabbit IgG horseradish peroxidase conjugate (Amersham) as the secondary antibody according to the manufacturer's instructions. FIG. 11 shows a Western blot containing extracts prepared from (shown left to right) BNN132 cells grown in the absence of IPTG (“C”) and QLB4 cells grown in the absence (“QLB4−”) and presence of IPTG (“QLB4+”), respectively. The location of the Cre recombinase band is indicated by the arrowhead. The additional bands seen on this Western blot are due to cross-reactivity of the crude (i.e., not affinity purified) rabbit anti-Cre antibody with bacterial proteins.

[0211] Western blot analysis demonstrated that Cre protein could not be detected in BNN132 cells grown in the presence or absence of IPTG. Cre protein was detected in QLB4 cells grown in the presence of IPTG, but not in the absence of IPTG, by Western blot analysis. Therefore, the expression of Cre recombinase in QLB4 cells is greatly induced by the presence of IPTG in the growth medium. By this analysis, the expression of Cre recombinase in QLB4 cells is dependent upon the induction of the lac-cre gene by IPTG. However, more sensitive functional assays indicate that the Cre protein was expressed constitutively at very low levels in both BNN132 cells and QLB4 cells in the absence of IPTG. In these functional assays, a pUNI vector (Kn^(R)) and a pHOST vector (Ap^(R)) were cotransformed into QLB4 cells and the transformed cells were grown on plates containing kanamycin to select for the presence of the pUNI-pHOST fusion plasmid. Plasmid DNA was isolated from individual kanamycin-resistant colonies and subjected to restriction enzyme digestion to examine the structure of the plasmid DNA. This analysis revealed that multiple isoforms of the plasmid fusion product were present in the plasmid DNA isolated from any single kanamycin-resistant colony. While not limiting the present invention to any particular mechanism, it is believed that low level constitutive expression of Cre recombinase leads to multiple fusion events between the pUNI and pHOST vectors resulting in the production of multimeric forms (i.e., trimer, tetramer, etc.) of the fused plasmid (the desired fused plasmid is a dimer formed by fusion of pUNI and pHOST). The multimeric plasmid fusion products would be expected to be unstable due to the fact that the Cre protein is constitutively expressed in QLB4 cells.

[0212] To overcome the potential problems that low level constitutive expression of the cre gene in the host cell may cause, the expression of cre can be more tightly controlled as described below. In addition to the approaches described below, the pUNI and pHOST vectors can be modified as described in Example 7 and these modified vectors can be fused using a host cell that constitutively expresses the Cre protein.

[0213] The expression of Cre recombinase can be more tightly controlled by a variety of means. For example, the expression of the cre gene can be made conditional when expressing cre under the control of the lac promoter by growing the host cells in medium containing glucose. The presence of 0.2% glucose in the growth medium virtually shuts down transcription from the lac promoter. In addition, the lac promoter can be modified to insert additional operator (o) sites which bind the lac repressor. Other tightly controlled promoters are known to the art (e.g., the T7 promoter which requires the expression of T7 RNA polymerase; these promoters are available on the pET vectors (Novagen)) and may be employed to control the expression of the cre gene.

[0214] In addition to placing the cre ORF under the control of a tightly controlled promoter, Cre expression can be tightly controlled by placing the cre gene on a plasmid containing a temperature-sensitive (ts) replicon (e.g., rep pSC101^(ts)). When the cre gene is carried on a ts replication plasmid, Cre will be expressed during the transformation of the host cell (because the host cell containing the ts plasmid containing the cre gene was maintained at the permissive temperature) but will be absent following recombination of the pUNI and pHOST vectors when the host cell is grown at a temperature non-permissive for replication of the ts replicon.

EXAMPLE 7 In Vivo Recombination in Prokaryotic Hosts Using the Univector Fusion System

[0215] As discussed above, Cre-loxP-mediated plasmid fusion can occur in vivo, although the reverse reaction, resolution of heterodimers, might decrease its utility. Ideally, it would be desirable to have Cre present only transiently to catalyze the initial fusion event, then absent to allow the stable propagation of the recombinant products. Therefore, a model was tested whereby UPS was explored in vivo in the E. coli strain BUN13 that conditionally expresses Cre recombinase under lac control and in a second strain carrying cre on a plasmid, pQL269, with a Ts origin of replication derived from pSC101. Experiments using BUN13 and co-transformation of pUNI-10 and pQL103, an Ap^(r) loxP-containing plasmid, showed that the UPS reaction occurred efficiently, but many colonies had a mixture of plasmids that required retransformation into a non-cre-expressing strain to stabilize. However, results with the Ts plasmid were better. Competent cells were prepared from JM107/pQL269 cells grown at 42° C. for several hours to cause loss of pQL269. Co-transformation of pUNI-10 and pQL103 into these cells followed by selection on kanamycin plates at 42° C. revealed that 25% contained the desired single pUNI-10-pQL103 co-integrant. These two experiments demonstrated that UPS can be used to generate plasmid fusions in vivo and provide an alternative to the in vitro reaction when Gst-Cre is not available.

[0216] As described in Example 6 and the experiments above, cotransformation of E. coli cells expressing Cre protein (e.g., QLB4, BNN132) with a pUNI construct and a pHOST construct (each construct containing a single lox site) results in the fusion of these two constructs in vivo. If the host cell used for the recombination reaction constitutively expresses the Cre protein, multimeric forms of the fused constructs are generated. In addition to the methods outlined above for tightly regulating the expression of the cre gene in the host cell, cells constitutively producing Cre protein can be employed with modified pUNI and pHOST vectors as described in this example. The pUNI construct is modified such that two different lox sites flank the kanamycin resistance gene (the modified pUNI construct is termed pUNI-D). The two lox sites differ in their spacer regions by one or two nucleotides and for the sake of discussion the two different lox sites are referred to as “loxA” and “loxB” (e.g., loxP and loxP511; “loxB” is used in this discussion to distinguish it from the first lox site termed “loxA” and does not indicate the use of the loxB sequence found in the E. coli chromosome). Cre cannot efficiently catalyze a recombination event between a loxA site and a loxB site due to the sequence changes located in the spacer regions between the Cre binding sites; however, Cre can efficiently catalyze the recombination between two loxA sites or two loxB sites [Hoess et al. (1986) Nucleic Acids Res. 14:2287]. The pHOST construct is modified such that one loxA site and one loxB site flank the selectable marker gene (the modified pHOST construct is termed pHOST-D). In this example, pHOST-D contains the sacB gene as the selectable marker (a negative selectable marker). The presence of the sacB gene on pHOST-D provides a means of counter-selection as cells expressing the sacB gene are killed when the cell is grown in medium containing 5% sucrose [Gay et al. (1985) J. Bacteriol. 164:918 and (1983) J. Bacteriol. 153:1424].
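The spacer-dependent pairing rule described above can be modelled in a simplified way. In the following Python sketch, two 34 bp lox sites are treated as recombination partners only when their 8 bp spacers are identical. The function names are hypothetical, and LOX_B is an illustrative single-base spacer variant constructed for this example rather than a sequence taken from this disclosure. The all-or-nothing rule is itself a simplification, since the text states only that recombination between different sites is inefficient.

```python
ARM_LENGTH = 13     # each Cre-binding inverted repeat
SPACER_LENGTH = 8   # asymmetric spacer between the repeats


def spacer(lox_site: str) -> str:
    """Return the 8 bp spacer of a 34 bp lox site (13 bp arm, spacer, 13 bp arm)."""
    if len(lox_site) != 2 * ARM_LENGTH + SPACER_LENGTH:
        raise ValueError("expected a 34 bp lox site")
    return lox_site[ARM_LENGTH:ARM_LENGTH + SPACER_LENGTH].upper()


def can_recombine(site_1: str, site_2: str) -> bool:
    """Simplified rule: sites pair efficiently only if their spacers match."""
    return spacer(site_1) == spacer(site_2)


# loxA: the loxP site as it appears within SEQ ID NO:4
LOX_A = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"
# loxB: hypothetical variant with one spacer base changed (C to T at index 14)
LOX_B = LOX_A[:14] + "T" + LOX_A[15:]

print(can_recombine(LOX_A, LOX_A), can_recombine(LOX_B, LOX_B), can_recombine(LOX_A, LOX_B))
```

Under this rule loxA pairs with loxA and loxB with loxB, but not loxA with loxB, which is the property the pUNI-D and pHOST-D design relies on.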

[0217] FIG. 12 provides a schematic showing the strategy for in vivo recombination in a Cre-expressing host cell (e.g., QLB4 cells) using the pUNI-D and pHOST-D constructs. Arrows are used to indicate the direction of transcription of various genes or gene segments in FIG. 12. In FIG. 12, the following abbreviations are used: Ap^(R) (ampicillin resistance gene); Kn^(R) (kanamycin resistance gene); Ori (non-conditional plasmid origin of replication); Ori^(R) (the R6Kγ conditional origin of replication); Cre (Cre recombinase); GENEX (gene of interest). The strategy outlined in FIG. 12 is referred to as the “in vivo gene-trap.” FIG. 12 illustrates that the second lox site (loxB) in pUNI-D (relative to the design of the pUNI-10 vector) is inserted between the kanamycin resistance gene and the R6Kγ conditional origin of replication.

[0218] To generate a pHOST-D construct, a commercially available expression vector containing the desired promoter (and optionally enhancer) is modified as described in Example 2 to insert the loxA site downstream of the promoter. However, it is not necessary that a commercially available expression vector be employed as the art is well aware of methods for the generation of expression vectors. Sequences encoding the sacB gene [Gay et al. (1983) J. Bacteriol. 153:1424; GenBank Accession Nos. X02730 and K01987] and the second lox site (loxB) are inserted downstream of the first lox site (loxA).

[0219] The pUNI-D and pHOST-D constructs are cotransformed into QLB4 cells (Example 6) and the transformed cells are plated onto LB/Ap/Kn plates containing 5% sucrose to select for the desired recombinant. FIG. 12 illustrates the recombination events that will occur in the presence of Cre in the QLB4 cells. First pUNI-D and pHOST-D will fuse to form two dimers in which two possible double cross-over events can occur. These two double cross-over events are diagrammed in FIG. 12. The double cross-over events will result in the exchange of the DNA segments that are flanked by loxA and loxB to produce the plasmids labelled “A” and “B.” All plasmids that contain the sacB gene (the pHOST-D, the fused plasmids and plasmid B) will be selected against by the presence of sucrose in the growth medium. The pUNI-D construct will not be able to replicate in QLB4 cells as these cells do not express the Π protein required for replication of the R6Kγ origin. Therefore, the only construct that will be maintained in QLB4 cells selected on LB/Kn containing sucrose is the desired plasmid A in which the gene of interest from pUNI-D has been placed under the transcriptional control of the promoter located on pHOST-D.
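
The selection logic of the in vivo gene trap may be sketched as a truth table. The following Python fragment is an illustration only and is not part of the described method; the `Plasmid` record and the marker content assigned to plasmid B (no antibiotic resistance gene, conditional origin only) are assumptions inferred from the arrangement described above, and the host is assumed to lack the replication protein required by the R6Kγ origin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    amp_r: bool        # carries the ampicillin resistance gene
    kan_r: bool        # carries the kanamycin resistance gene
    sacB: bool         # carries the sucrose counter-selectable marker
    general_ori: bool  # non-conditional origin (False = R6K gamma origin only)


def survives(p: Plasmid) -> bool:
    """Colony formation on LB/Ap/Kn + 5% sucrose in a host lacking the pi protein."""
    return p.general_ori and p.amp_r and p.kan_r and not p.sacB


# Marker content per species; plasmid B's markers are an assumption (see lead-in).
CANDIDATES = [
    Plasmid("pUNI-D", amp_r=False, kan_r=True, sacB=False, general_ori=False),
    Plasmid("pHOST-D", amp_r=True, kan_r=False, sacB=True, general_ori=True),
    Plasmid("fused dimer", amp_r=True, kan_r=True, sacB=True, general_ori=True),
    Plasmid("plasmid A", amp_r=True, kan_r=True, sacB=False, general_ori=True),
    Plasmid("plasmid B", amp_r=False, kan_r=False, sacB=True, general_ori=False),
]

survivors = [p.name for p in CANDIDATES if survives(p)]
print(survivors)
```

Under these assumptions only plasmid A satisfies all four conditions, which is the outcome the selection is designed to give.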

[0220] To illustrate this method, pUNI-10 was modified to place a second lox site, comprising the loxP511 sequence (SEQ ID NO:16), between the kanamycin resistance gene and the R6Kγ conditional origin of replication to create pUNI-10-D. A second lox site, comprising the loxP511 site, was inserted onto a loxP-containing expression plasmid (i.e., a pHOST vector) to create a pHOST-D vector. One-half of one microgram of each plasmid was cotransformed into competent QLB4 cells, aliquots of the transformed cells were plated onto LB/Ap plates and onto LB/Ap/Kn plates containing 5% sucrose, and the number of colonies on each type of plate was counted. The percentage of Ap^(R)Kn^(R) colonies which grew on sucrose-containing plates relative to the number of Ap^(R) colonies was 1% (1×10³/1×10⁵). Restriction enzyme digestion of plasmid DNA isolated from individual Ap^(R)Kn^(R) colonies which grew on sucrose-containing plates confirmed that the desired fusions had been generated. These results indicate that the in vivo gene trap method can be used to recombine a gene of interest carried on a pUNI-D vector into an expression vector using host cells that constitutively express the Cre protein.
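
The 1% figure follows directly from the two colony counts. A minimal Python sketch of the calculation is given below; the function name `percent_recombinants` is hypothetical and is introduced only for illustration.

```python
def percent_recombinants(double_resistant: float, amp_resistant: float) -> float:
    """Percentage of Ap/Kn-resistant, sucrose-tolerant colonies relative to Ap-resistant colonies."""
    if amp_resistant <= 0:
        raise ValueError("the Ap-resistant colony count must be positive")
    return 100.0 * double_resistant / amp_resistant


# Colony counts reported above: 1x10^3 on LB/Ap/Kn + sucrose, 1x10^5 on LB/Ap.
efficiency = percent_recombinants(1e3, 1e5)
print(efficiency)
```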

[0221] In addition to providing a means for recombining a gene of interest carried on a pUNI-D vector into an expression vector using host cells that constitutively express the Cre protein, the in vivo gene trap method provides a means to transfer a gene of interest contained on a linear DNA molecule (e.g., a PCR product) that lacks a selectable marker into an expression vector(s). The desired PCR product is amplified using two primers, each of which encodes a different lox site (a “loxA” and “loxB” site such as a loxP and loxP511 site). A pUNI vector is constructed that contains (5′ to 3′) a loxA site, a counter-selectable marker such as the sacB gene and a loxB site (i.e., the two different lox sites flank the counter-selectable marker). This pUNI vector also contains a conditional origin of replication and an antibiotic resistance gene as described above and in Example 1. The PCR product (loxA-amplified sequence-loxB) is recombined with the modified pUNI vector (which comprises loxA-counter-selectable marker-loxB) to create a pUNI vector containing the PCR product which now lacks the counter-selectable marker. This recombination event is selected for by growing the host cells in medium that kills the host if the counter-selectable gene is expressed. The PCR product in the pUNI vector (containing 2 lox sites) can then be placed under the control of the desired promoter element by recombining the pUNI/PCR product construct with the appropriate pHOST-D vector.

EXAMPLE 8 The Use of Modified LoxP Sites to Increase Expression of the Protein of Interest

[0222] The pUNI and pHOST constructs employed in the Univector Plasmid Fusion System were designed such that plasmid fusion resulted in the introduction of a lox site between the promoter and the gene of interest. LoxP sites consist of two 13 bp inverted repeats separated by an 8 bp spacer region [Hoess et al. (1982) Proc. Natl. Acad. Sci. USA 79:3398 and U.S. Pat. No. 4,959,317]. Transcripts of the gene of interest produced from a pUNI-pHOST fusion construct comprising a loxP site may have two 13 nucleotide perfect inverted repeats within the 5′ untranslated region (UTR) that have the potential to form a stem-loop structure (this will occur in those cases where pHOST does not encode an affinity domain at the amino-terminus of the fusion protein). It is currently believed that the ribosome scanning mechanism is the most commonly used mechanism for initiation of translation in eukaryotes (e.g., yeast and mammalian cells). Using this mechanism, the ribosome binds to the 5′ cap structure of the mRNA transcript and scans downstream along the 5′ UTR searching for the first ATG or translation start codon. Without limiting the present invention to any particular mechanism, it is possible that a stem-loop structure formed by the presence of a loxP sequence on the 5′ UTR of the mRNA encoding the protein of interest would block or reduce the efficiency of ribosome scanning and thus the translation initiation step could be impaired. There is evidence that stem-loop structures in the 5′ UTR of particular mRNAs reduce the efficiency of translation in eukaryotes [see, e.g., Donahue et al. (1988) Mol. Cell. Biol. 8:2964 and Yoon et al. Genes and Dev. (1992) 6:2463]. It is noted that no evidence suggests that the presence of a stem-loop structure in the coding region (as opposed to the 5′ UTR) of a transcript negatively affects its ability to be translated. It is likely that the energy of protein synthesis is sufficient to overcome secondary structures present in mRNAs. 
Indeed, the data presented in Example 5 show that a GST-SKP1 fusion construct produced using the Univector Fusion System (i.e., the construct contains a loxP site between the sequences encoding the Gst and Skp1 domains) produced the same level of fusion protein as did a conventional construct encoding a Gst-Skp1 fusion protein which lacks the loxP sequence. Therefore, concerns over the presence of a stem-loop structure caused by the presence of a lox sequence in a transcript encoded by a pUNI-pHOST fusion construct are limited to those constructs that do not generate fusion proteins.

[0223] If low levels of expression are observed when a gene of interest is expressed from a pUNI-pHOST fusion construct comprising lox sequences that comprise perfect 13 bp inverted repeats (e.g., loxP), pUNI and pHOST constructs containing mutated loxP sequences are employed. The mutated loxP sequences comprise point mutations that create mismatches between the two 13 bp inverted repeat sequences within the loxP site that disrupt the formation of or reduce the stability of a stem-loop structure. Specifically, two modified loxP sites were designed that have mismatches at different positions in the inverted repeats located within a loxP site. The 13 bp inverted repeats are binding sites for the Cre protein; thus, each loxP site has two binding sites for Cre. For the purpose of discussion, these two binding sites are referred to as L and R (left and right). The wild-type loxP site is designated L(0)-R(0) wherein “0” indicates the absence of a mutation (i.e., the wild-type sequence). Two derivatives of the wild-type loxP sequence were designed and termed loxP2 and loxP3. The sequences of loxP2 (SEQ ID NO:13) and loxP3 (SEQ ID NO:14), as well as the wild-type loxP sequence (SEQ ID NO:12), are shown in FIG. 13. LoxP2 is placed on the pUNI-10 construct (in place of the wild-type loxP site) and loxP3 is placed on the pHOST construct.

[0224] LoxP2 has repeats designated L(3,6)-R(0) which indicates that the third and sixth nucleotides of the left repeat are mutated; thus, a mismatch is introduced at the third and sixth positions between the L and R repeats of the loxP2 site. LoxP3 has repeats designated L(0)-R(9) which indicates that the ninth nucleotide on the right repeat sequence is mutated to introduce a mismatch at the ninth position between the L and R repeats of the loxP3 site. Fusion between the loxP2 site on the pUNI construct and the loxP3 site on the pHOST construct will generate a hybrid loxP23 site [L(3,6)-R(9)] located between the promoter and the gene of interest and a wild-type loxP site [L(0)-R(0)] at the distal junction. Thus, the loxP23 site (SEQ ID NO:15) in the 5′ UTR will have three mismatches distributed at positions 3, 6 and 9 between the 13 nucleotide inverted repeats which are expected to strongly destabilize the formation of the stem-loop structure. Other mutated loxP sequences suitable for disruption of the stem-loop structure will be apparent to those skilled in the art; therefore, the present invention is not limited to the use of the loxP2 and loxP3 sequences for the purpose of disrupting stem-loop formation on the 5′ UTR of transcripts produced from pUNI-pHOST fusion constructs. The suitability of any pair of mutated lox sites for use in the Univector Fusion system may be tested by placing one member of the pair on a pUNI vector and the other member on a pHOST construct. The two modified vectors are then recombined in vitro as described in Example 4 and the fusion reaction mixture is used to transform E. coli cells and the transformed cells are plated on selective medium (e.g., on LB/Amp and LB/Kan plates) in order to determine the efficiency of recombination between the two mutated lox sites (Example 4). The efficiency of recombination between the two mutated lox sites is compared to the efficiency of recombination between two wild-type loxP sites. 
Any pair of two different mutant lox sites that recombines at a rate that is about 5% or more of that observed using two loxP sites is a useful pair of mutated lox sites for use in avoiding the formation of a stem-loop structure on the 5′ UTR of the mRNA transcribed from the pUNI/pHOST fusion construct.
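
The bookkeeping behind the L/R designations and the 5% criterion can be illustrated with a short Python sketch. The representation of a lox site as a pair of tuples of mutated positions, and the function names `recombine`, `mismatched_positions` and `is_useful_pair`, are hypothetical conventions adopted only for this illustration; the sketch assumes that strand exchange within the spacer pairs the left repeat of one site with the right repeat of the other.

```python
def recombine(site_a, site_b):
    """Exchange arms of two lox sites, each given as (left_mutations, right_mutations)."""
    (left_a, right_a), (left_b, right_b) = site_a, site_b
    return (left_a, right_b), (left_b, right_a)


def mismatched_positions(site):
    """Positions at which the left and right repeats no longer pair."""
    left, right = site
    return sorted(set(left) | set(right))


def is_useful_pair(mutant_efficiency, wild_type_efficiency, threshold=0.05):
    """Apply the 'about 5% or more of wild-type' criterion stated above."""
    return mutant_efficiency >= threshold * wild_type_efficiency


loxP2 = ((3, 6), ())   # L(3,6)-R(0), carried on pUNI
loxP3 = ((), (9,))     # L(0)-R(9), carried on pHOST

hybrid, distal = recombine(loxP2, loxP3)
print(hybrid, mismatched_positions(hybrid))  # hybrid site and its mismatched positions
print(distal)                                # the site formed at the other junction
```

In this representation the two products are L(3,6)-R(9), with mismatches at positions 3, 6 and 9, and L(0)-R(0), in agreement with the loxP23 and wild-type sites described above.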

[0225] A strategy as described above was employed to determine if the reduced expression observed with the SKP1 ORF under control of the GAL1 promoter as described in Example 5 could be improved with mutated lox sites. A series of lox sites designed to reduce the stability of the stem-loop were employed. These, together with a control scrambled site, loxS, were placed between the GAL1 promoter and the lacZ reporter gene and β-galactosidase expression was measured. Mutations that decreased stem-loop stability tended to express better and one mutant, loxP^(L369), did not display any inhibitory effects. This mutant also retained 25% of the wild-type recombination efficiency and has been designated loxH (i.e., for host). The oligonucleotides used to generate the loxH site are based on the loxH sequence 5′-ATTACCTCATATAGCATACATTATACGAAGTTAT-3′ (SEQ ID NO:32). LoxH was further tested by using it to place MYC-RNR4 under GAL1 control and showed no translational interference, as shown in FIG. 17 (compare lanes 2, 3, and 4). LoxH's 25% recombination efficiency is well within the range useful for UPS-mediated plasmid constructions. Thus, it is recommended that loxH be used in pHOST recipient vectors intended for transcriptional fusions to maximize expression, while loxP should be used for all other applications because of its higher recombination efficiency.
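
The positions at which the loxH repeats fail to pair can be read directly from the sequence. The Python sketch below is illustrative only; it compares the left 13 bp repeat with the reverse complement of the right repeat for the loxH sequence given above (SEQ ID NO:32) and, for reference, for the 34 bp loxP core as it appears within the oligonucleotide linkers of Example 11. The function names are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # 34 bp loxP core found in the linker oligos
LOXH = "ATTACCTCATATAGCATACATTATACGAAGTTAT"  # SEQ ID NO:32

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def repeat_mismatches(site: str) -> list:
    """1-based positions where the left repeat cannot pair with the right repeat."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp site: 13 bp repeat, 8 bp spacer, 13 bp repeat")
    left, right = site[:13], site[21:]
    partner = reverse_complement(right)
    return [i + 1 for i, (a, b) in enumerate(zip(left, partner)) if a != b]


loxp_mismatches = repeat_mismatches(LOXP)
loxh_mismatches = repeat_mismatches(LOXH)
print(loxp_mismatches, loxh_mismatches)
```

The loxP repeats pair perfectly, whereas loxH carries mismatches at positions 3, 6 and 9 of the left repeat, consistent with the designation loxP^(L369).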

[0226] It will be apparent to those skilled in the art that a similar strategy can be employed for the modification of frt sites when the FLP recombinase is employed for the recombination event. The frt site, like lox sites, contains two 13 bp inverted repeats separated by an 8 bp spacer region.

EXAMPLE 9 Precise ORF Transfer (POT)

[0227] In order to transfer only the gene of interest from the Univector to the Host vector, the present invention provides a second recombination event that allows a resolution of the UPS-generated heterodimer. A schematic representation of the POT reaction is shown in FIG. 20. In one embodiment of the present invention, an R-recombination site, RS, was placed after the cloning site in pUNI (i.e., pUNI-20) such that any gene inserted into pUNI-20 would be flanked on the 5′ side by loxP and on the 3′ side by RS, although the present invention contemplates the use of any other second recombination system (e.g., the Res system). Host recipient vectors must also contain lox and RS elements in the correct order. The initial fusion event is catalyzed by Cre in the UPS reaction. The second reaction can be catalyzed in vitro by incubation with purified R-recombinase (Araki et al., supra) or in vivo by transformation into a strain (e.g., BUN15) expressing the R-recombinase under tac control on a Ts replication plasmid (e.g., pML66) that is lost when cells are plated at 42° C. POT works efficiently as a two step reaction in vivo or in vitro. Efficient resolution in vivo without a selection for the second recombination event requires incubation in LB plus IPTG after transformation prior to plating on selective media. Incubations of 1 h and 4 h gave 3% and 15% recombinants, respectively, which showed complete loss of the pUNI backbone through recombination between RS sequences. In vitro recombination catalyzed by the R recombinase achieved 30% recombinants.

[0228] The efficiency of recovering plasmids that have undergone POT can be greatly enhanced through the use of a recipient vector in which a counter-selectable marker is placed between the loxP and RS sites. For this purpose, the present invention utilized the ΦX174 E gene which is toxic when expressed in E. coli unless the host cell lacks the slyD gene [Maratea et al. (1985) Gene 40:39]. pAS2-E, a two hybrid bait vector derived from pAS2 [Durfee et al. (1994) Genes & Dev. 7:555] which contains in a 5′ to 3′ order loxP, E under control of the tac promoter, and an RS site, was fused with pUNI-20 containing the SKP1 gene, and the co-integrant was selected by transformation into CX1 (slyD⁻). This co-integrant was then transformed into BUN15 cells expressing the R recombinase and resolution events were isolated by selecting for Ap^(r) in the presence of IPTG to induce the E protein. Since BUN15 is slyD⁺, pAS2-E alone cannot survive in it because of toxicity due to E expression. However, when pAS2-E is fused to pUNI-20 derivatives, it can transform that strain because subsequent R-dependent site-specific recombination between RS sites will eliminate both the pUNI backbone and E. This results in the replacement of E with the corresponding region from pUNI. One hundred percent (24 of 24) of the Ap^(r) transformants resulting from the transformation of the pAS2-E-pUNI-20-SKP1 fusion plasmid showed precise transfer of the SKP1 gene from pUNI-20 into pAS2-E with only 1 hr incubation prior to plating on selective media.
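
The two recombination steps of POT can be followed by treating each plasmid as an ordered, circular list of elements. The following Python sketch is a simplified model and not a description of the actual plasmid maps: the element labels (including `upstream` for the host sequences preceding loxP, and `E` for the tac-driven E cassette) and the functions `fuse` and `resolve` are hypothetical, and both pairs of recombination sites are assumed to be in direct orientation.

```python
def fuse(circle_a, circle_b, site):
    """Cre-type integration: join two circles at a shared site into one co-integrant."""
    def rotate(circle):
        start = circle.index(site)
        return circle[start:] + circle[:start]
    return rotate(circle_a) + rotate(circle_b)


def resolve(circle, site):
    """Recombine two directly repeated copies of site, giving two circular products."""
    first = circle.index(site)
    second = circle.index(site, first + 1)
    product_1 = circle[:first + 1] + circle[second + 1:]
    product_2 = circle[first + 1:second + 1]
    return product_1, product_2


p_uni = ["loxP", "SKP1", "RS", "KnR", "oriR6K"]            # simplified pUNI-20-SKP1
p_host = ["upstream", "loxP", "E", "RS", "ApR", "ori"]     # simplified pAS2-E

cointegrant = fuse(p_uni, p_host, "loxP")
products = resolve(cointegrant, "RS")

# Selection for ApR while E expression is induced keeps only the product that
# carries ApR and lacks E.
pot_product = next(p for p in products if "ApR" in p and "E" not in p)
print(pot_product)
```

In this model the recovered circle carries SKP1 between loxP and RS on the host backbone, while the pUNI backbone and E leave together on the other circle, as described above.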

[0229] Transformation of a heterodimeric plasmid with E flanked by RS sites into BUN15 gave a transformation efficiency several orders of magnitude greater than transformation of the pAS2-E plasmid itself. This demonstrated that POT can be achieved in a single step by direct transformation of a UPS reaction into BUN15 (i.e., rather than a two-step process). pUNI-20-SKP1 and pAS2-E were incubated with Gst-Cre in a standard UPS reaction and the reaction mixture was transformed directly into BUN15 and Ap^(r) transformants were selected at 42° C. after a one-hour incubation. One hundred percent (20 of 20) of Ap^(r) transformants were found to have undergone POT with SKP1 replacing the E gene in pAS2-E as determined by restriction digestion with PvuII, as shown in FIG. 21. The sample shown in FIG. 21 was generated from plasmid DNA isolated from 10 different Ap^(r) transformants, digested as described above along with two parental plasmids, P1 (pUNI-20-SKP1) and P2 (pAS2-E) and I (the UPS-generated pUNI-20-SKP1-pAS2-E recombination intermediate). Precise ORF transfer resulted in the generation of a novel 800 bp PvuII fragment indicated by the arrowhead.

[0230] For POT assays, BUN15 cells were grown overnight in LB containing spectinomycin (50 μg/ml) at 30° C. BUN15 cells were diluted 1 to 100 in fresh LB/Spec medium containing 0.3 mM IPTG and grown to an OD of 0.5. Electrocompetent cells were prepared as recommended (Bio-Rad). Forty μl of competent cells were used in each transformation. After the electrotransformation, cells were incubated in LB plus IPTG for 1-8 hr for recovery before being plated on LB/Amp/IPTG 1 mM and incubated at 42° C.

EXAMPLE 10 Library Transfer Using UPS

[0231] The ability to use the methods and compositions of the present invention for generating and subcloning entire nucleic acid libraries is demonstrated in this Example. A random shear S. cerevisiae genomic library was made in pUNI-10 using the XhoI-adaptor strategy [Elledge et al. (1991) Proc. Natl. Acad. Sci. 88:1731]. This library had 5×10⁵ recombinants with 80% inserts ranging from 3 kb to 8 kb. This library was fused to pRS425-lox, a URA3 2μ plasmid, using UPS and 1.6×10⁶ recombinant fusion plasmids were recovered. This library was used to transform an S. cerevisiae cdc4-1 mutant strain Y543 and Ura⁺ transformants were selected at 34° C., the non-permissive temperature of cdc4-1. Of 31 plasmids capable of conferring growth at 34° C., three classes were recovered. One class was CDC4 as expected, the second was SKP1, and the third was CLB3. SKP1 and CLB4, a cyclin closely related to CLB3, had been previously shown to suppress cdc4-1 mutants when overexpressed from the GAL promoter [Bai et al. (1994) EMBO J. 13:6087; and Bai et al., supra]. These experiments demonstrate the feasibility of library transfer using UPS. In cases where a cDNA expression library is created, such as for the two hybrid system, once clones have been isolated, they can be rapidly converted back into simple Univector clones by Cre recombination in vivo. Using UPS, these plasmids can now be rapidly fused with any of a series of pHOST expression vectors for future analytical needs.
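
The two library sizes reported above imply roughly a three-fold representation of the original library after transfer. The Python sketch below works through that arithmetic; the Poisson estimate of the fraction of original clones recovered at least once is an added assumption (every clone equally likely to be fused and recovered) and is not a result reported in this Example.

```python
import math

ORIGINAL_CLONES = 5e5      # recombinants in the pUNI-10 genomic library
FUSION_PLASMIDS = 1.6e6    # recombinant fusion plasmids recovered after UPS

# Average number of times each original clone is represented after transfer.
fold_coverage = FUSION_PLASMIDS / ORIGINAL_CLONES

# Assumption: clones are sampled independently and uniformly (Poisson model).
fraction_represented = 1.0 - math.exp(-fold_coverage)

print(f"{fold_coverage:.1f}-fold coverage")
print(f"{fraction_represented:.1%} of clones expected at least once")
```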

EXAMPLE 11 General Material and Methods

[0232] This Example provides general materials and methods used throughout the experiments discussed above and below.

[0233] I. Media, Enzymes, and Chemicals

[0234] For drug selections, LB plates or liquid media were supplemented with either kanamycin (40 μg/ml) or ampicillin (100 μg/ml). When necessary, isopropyl β-D-thiogalactoside (IPTG) was added to a final concentration of 0.3 mM and X-Gal (Sigma) was used at 80 μg/ml. Yeast growth media and plates were made according to Rose et al. [Rose et al. (1990) Laboratory course manual for methods in yeast genetics, Cold Spring Harbor, N.Y., Cold Spring Harbor Laboratory Press]. Restriction endonucleases, the large (Klenow) fragment of E. coli DNA polymerase I, T4 polynucleotide kinase, T4 DNA polymerase, and T4 DNA ligase were purchased from New England Biolabs. Drugs were purchased from Sigma if not otherwise specified.

[0235] II. Bacterial and Yeast Strains

[0236] E. coli BW23474 [Δlac-169, robA1, creC510, hsdR514, uidA(ΔMluI)::pir-116, endA, recA1] and BW23473 [Δlac-169, robA1, creC510, hsdR514, uidA(ΔMluI)::pir⁺, endA, recA1] (Metcalf et al., supra) were gifts of B. Wanner and were used as hosts for propagation of all Univector-based plasmids. BUN10 [hisG4 thr-1 leuB6 t lacY1 kdgK51 Δ(gpt-proA)62 rpsL31 tsx33 supE44 recB21 recC22 sbcA23 hsdR::cat-pir-116(CmR)] was used for homologous recombination experiments. BUN13, which has cre under the control of the lac promoter, is JM107 lysogenized with λ_(LC) (aadA lac-cre). BUN15 is XL1 blue containing pML66 (tac-R, Sp^(r)) and was used for the in vivo RS recombination assays. E. coli JM107 or DH5α [Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Lab., Cold Spring Harbor, N.Y., 2nd Ed.] were the transformation recipients for all other plasmid constructions, including those made by UPS. E. coli BL21 was used as the host for bacterial expression studies. CX1 (ara leu purE gal trp his argG rpsL thi-1 supE lac^(Q) slyD1) was used for propagation of E expression clones. S. cerevisiae Y80 [Zhou and Elledge (1992) Genetics 131:851] was used for yeast expression studies and Y543 (as Y80 but cdc4-1) was used for cdc4 suppression (Bai et al., 1994, supra).

[0237] III. Plasmid Construction

[0238] The construction of several of the plasmids used in the examples of the present invention is provided below. These examples are provided to illustrate strategies and general methods used in making plasmids for use in the UPS. However, these specific plasmids and methods of construction are not required to practice the present invention.

[0239] For the Gst-Cre expression construct, pQL123, the cre ORF was amplified by PCR and an NcoI site placed at the first ATG using primers 5′-CCATGGCCAATTTACTGACCGTACAC-3′ (SEQ ID NO:21) and 5′-CCCGGGCTAATCGCCATCTTCCAGC-3′ (SEQ ID NO:20). The PCR product was cloned into pCR™II (Invitrogen) and subcloned as an NcoI-EcoRI fragment into NcoI-EcoRI digested pGEX-2TKcs to create pQL123.

[0240] The pHOST plasmid pQL103 was made by deleting one loxP site from pSE1086, which contains a XhoI-loxP-NotI-loxP-SalI cassette, by digestion with NotI and SalI, filling in the ends with Klenow and religation. The 590 bp NcoI-BamHI fragment containing the S. cerevisiae SKP1 ORF was subcloned from pCB149 into NcoI-BamHI-cut pUNI-10 to create pQL130 (pUNI-SKP1).

[0241] A second subclone of SKP1 is pML73, which contains the same 5′ end of SKP1 but an additional 800 bp of genomic DNA to the next BamHI site at the 3′ end cloned into pUNI-20. pML73 was used for the POT experiments. An oligo linker containing loxP and flanked by NcoI and BamHI overhangs was made by annealing two oligos, 5′-CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG-3′ (SEQ ID NO:22) and 5′-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTAT-3′ (SEQ ID NO:23), and then ligating into NcoI and BamHI digested pGEX-2TKcs to create pHB2-GST. The MYC₃-RNR4 gene was subcloned from pMH176 [Huang and Elledge (1997) Mol. Cell. Biol. 17:6105] as a XhoI-SacI fragment into XhoI-SacI-cleaved pUNI-10 to create pQL248, or into SalI-SacI digested pBAD104, a GAL1 expression vector, to create the control lacking loxP. Two pBAD104-derived recipient vectors, pQL138 and pQL193, were constructed by insertion of either a wild-type loxP or loxP³⁶⁹ sequence into the polylinker using primer pairs:

[0242] 5′-TCGAGACGTCATAACTTCGTATAGCATACATTATACGAAGTTATGC-3′ (SEQ ID NO:24) and

[0243] 5′-GCCGCATAACTTCGTATAATGTATGCTATACGATGTTATGACGTC-3′ (SEQ ID NO:25) (pQL138), or

[0244] 5′-CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG-3′ (SEQ ID NO:26) and

[0245] 5′-GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC-3′ (SEQ ID NO:27) (pQL193). Two GAL1:MYC₃-RNR4 constructs were made by UPS between pQL248 and pQL138 or pQL193.
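
Whether a pair of synthetic oligonucleotides anneals to give a duplex with the intended cohesive ends can be checked in a few lines. The Python sketch below is offered only as an illustration; it uses the oligonucleotides of SEQ ID NO:26 and SEQ ID NO:27 as printed above, assumes 4-nucleotide 5′ overhangs, and the function name `anneal` is hypothetical. NcoI and BamHI digestion leave 5′-CATG and 5′-GATC overhangs, respectively.

```python
TOP = "CATGGCTATAACTTCGTATAGCATACATTATACGAAGTTATG"     # SEQ ID NO:26
BOTTOM = "GATCCATAACTTCGTATAATGTATGCTATACGAAGTTATAGC"  # SEQ ID NO:27
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"            # 34 bp loxP core

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def anneal(top: str, bottom: str, overhang: int = 4):
    """Return (top 5' overhang, duplex as read on the top strand, bottom 5' overhang)."""
    duplex_top = top[overhang:]
    duplex_bottom = reverse_complement(bottom[overhang:])
    if duplex_top != duplex_bottom:
        raise ValueError("the oligonucleotides do not pair over the expected region")
    return top[:overhang], duplex_top, bottom[:overhang]


top_end, duplex, bottom_end = anneal(TOP, BOTTOM)
print(top_end, len(duplex), bottom_end, LOXP in duplex)
```

For this pair the check reports a 38 bp duplex containing the loxP core, flanked by CATG and GATC overhangs compatible with NcoI- and BamHI-cut vector ends.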

[0246] For the construction of pQL269 (lac-cre aadA on a Ts pSC101 ori), the EcoRI-PvuII fragment from pQL114 containing aadA and the lac-cre gene fusion was ligated to a BglI (made blunt by T4 polymerase)-EcoRI fragment from pINT-ts [Hasan et al. (1994) Gene 150:51] containing the Ts replication origin and transformants were screened for Sp^(R) and Ts growth at 42° C. A plasmid with those properties was designated pQL269.

[0247] pML66 was constructed by ligating the EcoRI-SalI (blunt) fragment containing the tac promoter driving the R recombinase from pNN115 (Araki et al., supra) into EcoRI-PstI (blunt) cleaved pQL269. This spectinomycin resistant plasmid expresses R protein in the presence of IPTG and is lost from cells grown at 42° C. because of a temperature sensitive replication mutation.

[0248] pUNI-Amp was made by placing the bla gene from pUC19 in place of the neo gene on pUNI-20 by generating a PCR product of bla and ligating that into MluI-NheI (blunt) cleaved pUNI-20. The subcloning of the triple MYC tag into pUNI-Amp was accomplished by PCR amplification of the 3×MYC tag present on pJBN48 by the primers MZL154, 5′-AAATTTCTCGAGGCTCTGAGCAAAAGCTCAT-3′ (SEQ ID NO:28) and MZL155, 5′-TATATATAGCGGCCGCTTAATTAAGATCCTCCTCGGATA-3′ (SEQ ID NO:29), followed by cleavage of the PCR product with XhoI and NotI and ligation into XhoI-NotI cleaved pUNI-Amp to generate pML74. The sequences of the PCR primers used to amplify the 3×MYC tag from pML74 for tagging the C-terminus of SKP1 by homologous recombination were primer A (MZL160) 5′-CCAGAGGAGGAGGCTGCCATTAGGCGTGAAAATGAATGGGCTGAAGACCGTCTGAGCAAAAGCTCATTTC-3′ (SEQ ID NO:30) and primer B (MZL161) 5′-GGATATAGTTCCTCCTTTCAGC-3′ (SEQ ID NO:31).

[0249] pAS2-E was constructed by first placing a synthetic loxP site between the NcoI-SalI sites of pAS2 to make pAS2-lox, and then generating an E-containing fragment with the following features: a 5′ XhoI site, the tac promoter driving E, and a 3′ SpeI site. The XhoI-SpeI fragment was ligated together with a SpeI-PstI synthetic RS fragment into XhoI-PstI cleaved pAS2-lox to make pAS2-E (pML71).

[0250] IV. β-galactosidase Assays

[0251] Yeast cells expressing the GAL1:lacZ reporter constructs containing different loxP sequences were grown at 30° C. to mid-log phase (OD₆₀₀=0.5-0.6) in SC-Ura media containing 2% raffinose; galactose was then added to a final concentration of 2%, and the cells were incubated at 30° C. for two hours. β-galactosidase activities were measured as described by Zhou and Elledge (Zhou and Elledge, supra).

EXAMPLE 12 Construction of BUN13

[0252] This Example describes the construction of BUN13, a lambda lysogen with cre under lac control. pSE356 contains a cassette consisting of the Tn5 neo gene, the lac promoter, and a polylinker sequence surrounded by stretches of λ DNA sequence. pQL114, the plasmid used to recombine the cre gene into λ, was constructed in two steps. First, the BamHI-HindIII (made blunt by T4 DNA polymerase) fragment containing the spectinomycin resistance gene aadA from pDPT270 [Taylor and Cohen (1979) J. Bacteriol. 137:92] was subcloned into BamHI-SphI (made blunt by T4 DNA polymerase) digested pSE356 to create pQL102, replacing neo with aadA. Secondly, a NotI site was engineered at the 5′ end of the ribosomal binding site of the cre gene by PCR using primers 5′-GCGGCCGCTGAGTGTTAAATGTCCAATT-3′ (SEQ ID NO:19) and 5′-CCCGGGCTAATCGCCATCTTCCAGC-3′ (SEQ ID NO:20). The PCR product was cloned into pCR™II and subcloned as a NotI-EcoRI fragment into NotI-EcoRI digested pQL102 to create pQL114, placing cre under lac control adjacent to aadA and flanked by λ DNA sequence. λ^(KC) (Elledge et al., supra) was amplified on JM107 containing pQL114 and the resulting phage lysate containing the desired recombinant λ_(LC) phage was used to infect JM107. Sp^(r)Kn^(s) lysogens were selected and tested for Cre expression and the ability to perform UPS. One strain with those properties was designated BUN13.

[0253] It is clear from the above that the present invention provides methods for the subcloning of nucleic acid molecules that permit the rapid transfer of a target nucleic acid sequence (e.g., a gene of interest) from one nucleic acid molecule to another in vitro or in vivo without the need to rely upon restriction enzyme digestions.

[0254] All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the following claims.

1 32 1 2220 DNA Artificial Sequence Description of Artificial SequenceSynthetic 1 aattctgtca gccgttaagt gttcctgtgt cactgaaaat tgctttgagaggctctaagg 60 gcttctcagt gcgttacatc cctggcttgt tgtccacaac cgttaaaccttaaaagcttt 120 aaaagcctta tatattcttt tttttcttat aaaacttaaa accttagaggctatttaagt 180 tgctgattta tattaatttt attgttcaaa catgagagct tagtacgtgaaacatgagag 240 cttagtacgt tagccatgag agcttagtac gttagccatg agggtttagttcgttaaaca 300 tgagagctta gtacgttaaa catgagagct tagtacgtga aacatgagagcttagtacgt 360 actatcaaca ggttgaactg ctgatcaaca gatcctctac gcggccgcggtaccataact 420 tcgtatagca tacattatac gaagttatct ggaattcccc gggctcgagaacatatggcc 480 atggggatcc gcggccgcaa ttgttaacag atccgtcgac gagctcgctatcagcctcga 540 ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgccttccttgaccc 600 tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgcatcgcattgtc 660 tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaagggggaggatt 720 gggaagacaa tagcaggcat gctggggatt ctagaagatc cggctgctaacaaagcccga 780 aaggaagctg agttggctgc tgccaccgct gagcaataac tagcataaccccttggggcc 840 tctaaacggg tcttgagggg ttttttgctg aaaggaggaa ctatatccggatatcccggg 900 gtgggcgaag aactccagca tgagatcccc gcgctggagg atcatccagccggcgtcccg 960 gaaaacgatt ccgaagccca acctttcata gaaggcggcg gtggaatcgaaatctcgtga 1020 tggcaggttg ggcgtcgctt ggtcggtcat ttcgaacccc agagtcccgctcagaagaac 1080 tcgtcaagaa ggcgatagaa ggcgatgcgc tgcgaatcgg gagcggcgataccgtaaagc 1140 acgaggaagc ggtcagccca ttcgccgcca agctcttcag caatatcacgggtagccaac 1200 gctatgtcct gatagcggtc cgccacaccc agccggccac agtcgatgaatccagaaaag 1260 cggccatttt ccaccatgat attcggcaag caggcatcgc catgggtcacgacgagatcc 1320 tcgccgtcgg gcatgcgcgc cttgagcctg gcgaacagtt cggctggcgcgagcccctga 1380 tgctcttcgt ccagatcatc ctgatcgaca agaccggctt ccatccgagtacgtgctcgc 1440 tcgatgcgat gtttcgcttg gtggtcgaat gggcaggtag ccggatcaagcgtatgcagc 1500 cgccgcattg catcagccat gatggatact ttctcggcag gagcaaggtgagatgacagg 1560 agatcctgcc ccggcacttc gcccaatagc agccagtccc ttcccgcttcagtgacaacg 1620 tcgagcacag ctgcgcaagg aacgcccgtc gtggccagcc 
acgatagccgcgctgcctcg 1680 tcctgcagtt cattcagggc accggacagg tcggtcttga caaaaagaaccgggcgcccc 1740 tgcgctgaca gccggaacac ggcggcatca gagcagccga ttgtctgttgtgcccagtca 1800 tagccgaata gcctctccac ccaagcggcc ggagaacctg cgtgcaatccatcttgttca 1860 atcatgcgaa acgatcctca tcctgtctct tgatcagatc ttgatcccctgcgccatcag 1920 atccttggcg gcaagaaagc catccagttt actttgcagg gcttcccaaccttaccagag 1980 ggcgccccag ctggcaattc cggttcgctt gctgtccata aaaccgcccagtctagctat 2040 cgccatgtaa gcccactgca agctacctgc tttctctttg cgcttgcgttttcccttgtc 2100 cagatagccc agtagctgac attcatccgg ggtcagcacc gtttctgcggactggctttc 2160 tacgtgttcc gcttccttta gcagcccttg cgccctgagt gcttgcggcagcgtgaagct 2220 2 16 DNA Artificial Sequence Description of ArtificialSequence Synthetic 2 ggatccccgg gaattc 16 3 36 DNA Artificial SequenceDescription of Artificial Sequence Synthetic 3 ggatcgcata tgcccatggctcgaggatcc gaattc 36 4 42 DNA Artificial Sequence Description ofArtificial Sequence Synthetic 4 catggctata acttcgtata gcatacattatacgaagtta tg 42 5 42 DNA Artificial Sequence Description of ArtificialSequence Synthetic 5 gatccataac ttcgtataat gtatgctata cgaagttata gc 42 646 DNA Artificial Sequence Description of Artificial Sequence Synthetic6 ggccggacgt cataacttcg tatagcatac attatacgaa gttatg 46 7 46 DNAArtificial Sequence Description of Artificial Sequence Synthetic 7gatccataac ttcgtataat gtatgctata cgaagttatg acgtcc 46 8 46 DNAArtificial Sequence Description of Artificial Sequence Synthetic 8tcgagacgtc ataacttcgt atagcataca ttatacgaag ttatgc 46 9 46 DNAArtificial Sequence Description of Artificial Sequence Synthetic 9ggccgcataa cttcgtataa tgtatgctat acgaagttat gacgtc 46 10 1740 DNAArtificial Sequence Description of Artificial Sequence Synthetic 10atgtccccta tactaggtta ttggaaaatt aagggccttg tgcaacccac tcgacttctt 60ttggaatatc ttgaagaaaa atatgaagag catttgtatg agcgcgatga aggtgataaa 120tggcgaaaca aaaagtttga attgggtttg gagtttccca atcttcctta ttatattgat 180ggtgatgtta aattaacaca gtctatggcc atcatacgtt atatagctga caagcacaac 240atgttgggtg 
gttgtccaaa agagcgtgca gagatttcaa tgcttgaagg agcggttttg 300gatattagat acggtgtttc gagaattgca tatagtaaag actttgaaac tctcaaagtt 360gattttctta gcaagctacc tgaaatgctg aaaatgttcg aagatcgttt atgtcataaa 420acatatttaa atggtgatca tgtaacccat cctgacttca tgttgtatga cgctcttgat 480gttgttttat acatggaccc aatgtgcctg gatgcgttcc caaaattagt ttgttttaaa 540aaacgtattg aagctatccc acaaattgat aagtacttga aatccagcaa gtatatagca 600tggcctttgc agggctggca agccacgttt ggtggtggcg accatcctcc aaaatcggat 660ctggttccgc gtggatctcg tcgtgcatct gttggatcgc atatgcccat ggccaattta 720ctgaccgtac accaaaattt gcctgcatta ccggtcgatg caacgagtga tgaggttcgc 780aagaacctga tggacatgtt cagggatcgc caggcgtttt ctgagcatac ctggaaaatg 840cttctgtccg tttgccggtc gtgggcggca tggtgcaagt tgaataaccg gaaatggttt 900cccgcagaac ctgaagatgt tcgcgattat cttctatatc ttcaggcgcg cggtctggca 960gtaaaaacta tccagcaaca tttgggccag ctaaacatgc ttcatcgtcg gtccgggctg 1020ccacgaccaa gtgacagcaa tgctgtttca ctggttatgc ggcggatccg aaaagaaaac 1080gttgatgccg gtgaacgtgc aaaacaggct ctagcgttcg aacgcactga tttcgaccag 1140gttcgttcac tcatggaaaa tagcgatcgc tgccaggata tacgtaatct ggcatttctg 1200gggattgctt ataacaccct gttacgtata gccgaaattg ccaggatcag ggttaaagat 1260atctcacgta ctgacggtgg gagaatgtta atccatattg gcagaacgaa aacgctggtt 1320agcaccgcag gtgtagagaa ggcacttagc ctgggggtaa ctaaactggt cgagcgatgg 1380atttccgtct ctggtgtagc tgatgatccg aataactacc tgttttgccg ggtcagaaaa 1440aatggtgttg ccgcgccatc tgccaccagc cagctatcaa ctcgcgccct ggaagggatt 1500tttgaagcaa ctcatcgatt gatttacggc gctaaggatg actctggtca gagatacctg 1560gcctggtctg gacacagtgc ccgtgtcgga gccgcgcgag atatggcccg cgctggagtt 1620tcaataccgg agatcatgca agctggtggc tggaccaatg taaatattgt catgaactat 1680atccgtaacc tggatagtga aacaggggca atggtgcgcc tgctggaaga tggcgattag 174011 579 PRT Artificial Sequence Description of Artificial SequenceSynthetic 11 Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val GlnPro 1 5 10 15 Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu GluHis Leu 20 25 30 Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys 
Lys PheGlu Leu 35 40 45 Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly AspVal Lys 50 55 60 Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp LysHis Asn 65 70 75 80 Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile SerMet Leu Glu 85 90 95 Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg IleAla Tyr Ser 100 105 110 Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu SerLys Leu Pro Glu 115 120 125 Met Leu Lys Met Phe Glu Asp Arg Leu Cys HisLys Thr Tyr Leu Asn 130 135 140 Gly Asp His Val Thr His Pro Asp Phe MetLeu Tyr Asp Ala Leu Asp 145 150 155 160 Val Val Leu Tyr Met Asp Pro MetCys Leu Asp Ala Phe Pro Lys Leu 165 170 175 Val Cys Phe Lys Lys Arg IleGlu Ala Ile Pro Gln Ile Asp Lys Tyr 180 185 190 Leu Lys Ser Ser Lys TyrIle Ala Trp Pro Leu Gln Gly Trp Gln Ala 195 200 205 Thr Phe Gly Gly GlyAsp His Pro Pro Lys Ser Asp Leu Val Pro Arg 210 215 220 Gly Ser Arg ArgAla Ser Val Gly Ser His Met Pro Met Ala Asn Leu 225 230 235 240 Leu ThrVal His Gln Asn Leu Pro Ala Leu Pro Val Asp Ala Thr Ser 245 250 255 AspGlu Val Arg Lys Asn Leu Met Asp Met Phe Arg Asp Arg Gln Ala 260 265 270Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val Cys Arg Ser Trp 275 280285 Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe Pro Ala Glu Pro 290295 300 Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala Arg Gly Leu Ala305 310 315 320 Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn Met LeuHis Arg 325 330 335 Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala ValSer Leu Val 340 345 350 Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala GlyGlu Arg Ala Lys 355 360 365 Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe AspGln Val Arg Ser Leu 370 375 380 Met Glu Asn Ser Asp Arg Cys Gln Asp IleArg Asn Leu Ala Phe Leu 385 390 395 400 Gly Ile Ala Tyr Asn Thr Leu LeuArg Ile Ala Glu Ile Ala Arg Ile 405 410 415 Arg Val Lys Asp Ile Ser ArgThr Asp Gly Gly Arg Met Leu Ile His 420 425 430 Ile Gly Arg Thr Lys ThrLeu Val Ser Thr Ala Gly Val Glu Lys Ala 435 440 445 Leu Ser Leu Gly ValThr Lys Leu Val Glu Arg Trp Ile Ser Val Ser 450 455 460 Gly Val Ala 
AspAsp Pro Asn Asn Tyr Leu Phe Cys Arg Val Arg Lys 465 470 475 480 Asn GlyVal Ala Ala Pro Ser Ala Thr Ser Gln Leu Ser Thr Arg Ala 485 490 495 LeuGlu Gly Ile Phe Glu Ala Thr His Arg Leu Ile Tyr Gly Ala Lys 500 505 510Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly His Ser Ala Arg 515 520525 Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val Ser Ile Pro Glu 530535 540 Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile Val Met Asn Tyr545 550 555 560 Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val Arg LeuLeu Glu 565 570 575 Asp Gly Asp 12 34 DNA Artificial SequenceDescription of Artificial Sequence Synthetic 12 ataacttcgt atagcatacattatacgaag ttat 34 13 34 DNA Artificial Sequence Description ofArtificial Sequence Synthetic 13 attacctcgt atagcataca ttatacgaag ttat34 14 34 DNA Artificial Sequence Description of Artificial SequenceSynthetic 14 ataacttcgt atagcataca ttatatgaag ttat 34 15 34 DNAArtificial Sequence Description of Artificial Sequence Synthetic 15attacctcgt atagcataca ttatatgaag ttat 34 16 34 DNA Artificial SequenceDescription of Artificial Sequence Synthetic 16 ataacttcgt atagtatacattatacgaag ttat 34 17 34 DNA Artificial Sequence Description ofArtificial Sequence Synthetic 17 acaacttcgt ataatgtatg ctatacgaag ttat34 18 34 DNA Artificial Sequence Description of Artificial SequenceSynthetic 18 gaagttccta ttctctagaa agtataggaa cttc 34 19 28 DNAArtificial Sequence Description of Artificial Sequence Synthetic 19gcggccgctg agtgttaaat gtccaatt 28 20 25 DNA Artificial SequenceDescription of Artificial Sequence Synthetic 20 cccgggctaa tcgccatcttccagc 25 21 26 DNA Artificial Sequence Description of ArtificialSequence Synthetic 21 ccatggccaa tttactgacc gtacac 26 22 42 DNAArtificial Sequence Description of Artificial Sequence Synthetic 22catggctata acttcgtata gcatacatta tacgaagtta tg 42 23 39 DNA ArtificialSequence Description of Artificial Sequence Synthetic 23 gatccataacttcgtataat gtatgctata cgaagttat 39 24 46 DNA Artificial SequenceDescription of Artificial Sequence 
Synthetic 24 tcgagacgtc ataacttcgtatagcataca ttatacgaag ttatgc 46 25 45 DNA Artificial SequenceDescription of Artificial Sequence Synthetic 25 gccgcataac ttcgtataatgtatgctata cgatgttatg acgtc 45 26 42 DNA Artificial Sequence Descriptionof Artificial Sequence Synthetic 26 catggctata acttcgtata gcatacattatacgaagtta tg 42 27 42 DNA Artificial Sequence Description of ArtificialSequence Synthetic 27 gatccataac ttcgtataat gtatgctata cgaagttata gc 4228 31 DNA Artificial Sequence Description of Artificial SequenceSynthetic 28 aaatttctcg aggctctgag caaaagctca t 31 29 39 DNA ArtificialSequence Description of Artificial Sequence Synthetic 29 tatatatagcggccgcttaa ttaagatcct cctcggata 39 30 70 DNA Artificial SequenceDescription of Artificial Sequence Synthetic 30 ccagaggagg aggctgccattaggcgtgaa aatgaatggg ctgaagaccg tctgagcaaa 60 agctcatttc 70 31 22 DNAArtificial Sequence Description of Artificial Sequence Synthetic 31ggatatagtt cctcctttca gc 22 32 34 DNA Artificial Sequence Description ofArtificial Sequence Synthetic 32 attacctcat atagcataca ttatacgaag ttat34
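
The 34-nucleotide target sites in the listing above can be examined with a short script. The following is an illustrative sketch only, not part of the disclosed methods. It assumes the commonly described layout of a loxP-type site (two 13-nucleotide inverted repeats flanking an 8-nucleotide spacer) and applies it to SEQ ID NO: 12 and SEQ ID NO: 16, copied from the listing with the spacing removed. The function and constant names are hypothetical and were chosen for this example.

```python
# Illustrative sketch. The two sequences are SEQ ID NO: 12 and SEQ ID NO: 16
# from the sequence listing, with the block spacing removed.
SEQ_ID_12 = "ataacttcgtatagcatacattatacgaagttat"
SEQ_ID_16 = "ataacttcgtatagtatacattatacgaagttat"

# Assumed layout of a 34-nucleotide lox-type site: arm, spacer, arm.
ARM_LEN = 13
SPACER_LEN = 8

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_site(site):
    """Split a site into (left arm, spacer, right arm) under the assumed layout."""
    if len(site) != 2 * ARM_LEN + SPACER_LEN:
        raise ValueError("expected a site of %d nucleotides" % (2 * ARM_LEN + SPACER_LEN))
    left = site[:ARM_LEN]
    spacer = site[ARM_LEN:ARM_LEN + SPACER_LEN]
    right = site[ARM_LEN + SPACER_LEN:]
    return left, spacer, right


def arms_are_inverted_repeats(site):
    """True when the right arm is the reverse complement of the left arm."""
    left, _, right = split_site(site)
    return reverse_complement(left) == right


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print(split_site(SEQ_ID_12))
    print(split_site(SEQ_ID_16))
    print(mismatch_positions(SEQ_ID_12, SEQ_ID_16))
```

Under these assumptions the two sites share identical arms and differ at a single nucleotide, position 15, which lies in the spacer. A sequence comparison of this kind only locates differences between sites; whether a given pair of sites is recombined by a particular recombinase must be determined experimentally.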

We claim:
 1. A method for the recombination of nucleic acid constructs, comprising: a) providing: i) a first nucleic acid construct comprising, in operable order, an origin of replication, a first sequence-specific recombinase target site, and a nucleic acid of interest; ii) a second nucleic acid construct comprising, in operable order, an origin of replication, a regulatory element and a second sequence-specific recombinase target site adjacent to and downstream from said regulatory element; and iii) a site-specific recombinase; b) contacting said first and said second nucleic acid constructs with said site-specific recombinase under conditions such that said first and second nucleic acid constructs are recombined to form a third nucleic acid construct, wherein said nucleic acid of interest is operably linked to said regulatory element.
 2. The method of claim 1, wherein said regulatory element comprises a promoter element.
 3. The method of claim 1, wherein said regulatory element comprises a fusion peptide.
 4. The method of claim 3, wherein said fusion peptide comprises an affinity domain.
 5. The method of claim 3, wherein said fusion peptide comprises an epitope tag.
 6. The method of claim 1, wherein said nucleic acid of interest comprises a gene.
 7. The method of claim 1, wherein said first nucleic acid construct further comprises a selectable marker.
 8. The method of claim 1, wherein said second nucleic acid construct further comprises a selectable marker.
 9. The method of claim 1, wherein said first nucleic acid construct further comprises a prokaryotic termination sequence.
 10. The method of claim 1, wherein said first nucleic acid construct further comprises a eukaryotic polyadenylation sequence.
 11. The method of claim 1, wherein said first nucleic acid construct further comprises a conditional origin of replication.
 12. The method of claim 1, wherein said first sequence-specific recombinase target site is selected from the group consisting of loxP, loxP2, loxP3, loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att.
 13. The method of claim 1, wherein said second sequence-specific recombinase target site is selected from the group consisting of loxP, loxP2, loxP3, loxP23, loxP511, loxB, loxC2, loxL, loxR, loxΔ86, loxΔ117, frt, dif, loxH and att.
 14. The method of claim 1, wherein said first nucleic acid construct further comprises a polylinker.
 15. The method of claim 1, wherein said contacting said first and said second nucleic acid constructs with said site-specific recombinase comprises introducing said first and said second nucleic acid constructs into a host cell under conditions such that said third nucleic acid construct is capable of replicating in said host cell.
 16. The method of claim 15, wherein said site-specific recombinase is encoded by said host cell.
 17. The method of claim 1, wherein said first nucleic acid construct further comprises a third sequence-specific recombinase target site and said second nucleic acid construct further comprises a fourth sequence-specific recombinase target site.
 18. The method of claim 17, wherein said first sequence-specific recombinase target site and said third sequence-specific recombinase target site in said first nucleic acid construct are located on opposite sides of said nucleic acid of interest.
 19. The method of claim 17, wherein said third and fourth sequence-specific recombinase target sites are selected from the group consisting of RS sites and Res sites.
 20. The method of claim 1, wherein said first nucleic acid construct further comprises a third sequence-specific recombinase target site and said second nucleic acid construct further comprises a fourth sequence-specific recombinase target site, wherein the method further comprises providing a second site-specific recombinase and step c) contacting said third nucleic acid construct with said second site-specific recombinase under conditions such that said third nucleic acid construct is recombined to form a fourth and a fifth nucleic acid construct.
 21. A recombined nucleic acid construct prepared according to the method of claim 1.
 22. A method for the recombination of nucleic acid constructs, comprising: a) providing: i) a vector; ii) a linear nucleic acid molecule comprising a sequence complementary to at least a portion of said vector; and iii) an E. coli host cell, wherein said host cell comprises an endogenous recombination system, a loss of function rec mutation, a suppressor, and a loss of function endogenous restriction modification system mutation; and b) introducing said vector and said linear nucleic acid molecule into said host cell under conditions such that said linear nucleic acid molecule and said vector are recombined to form a recombinant nucleic acid construct.
 23. The method of claim 22, wherein said loss of function rec mutation is selected from the group consisting of recBC and recD.
 24. The method of claim 22, wherein said suppressor comprises sbc.
 25. The method of claim 22, wherein said loss of function endogenous restriction modification system mutation comprises hsdR.
 26. A method for the cloning of nucleic acid libraries, comprising: a) providing: i) a plurality of first nucleic acid constructs comprising, in operable order, an origin of replication, a first sequence-specific recombinase target site, and a nucleic acid member from a nucleic acid library; ii) a plurality of second nucleic acid constructs comprising, in operable order, an origin of replication, a regulatory element and a second sequence-specific recombinase target site adjacent to and downstream from said regulatory element; and iii) a site-specific recombinase; b) contacting said plurality of first and second nucleic acid constructs with said site-specific recombinase under conditions such that said plurality of first and second nucleic acid constructs are recombined to form a plurality of third nucleic acid constructs, wherein said nucleic acid members from said nucleic acid library are operably linked to said regulatory elements.
 27. A nucleic acid library prepared according to the method of claim 26.
 28. A method for the directional cloning of a nucleic acid molecule, comprising: a) providing: i) first and second portions of a regulatory element; ii) a first nucleic acid molecule comprising said first portion of said regulatory element; and iii) a second nucleic acid molecule comprising said second portion of said regulatory element; and b) combining said first and said second nucleic acid molecules to produce a third nucleic acid molecule under conditions whereby an intact regulatory element is produced from the combination of said first and said second portions of said regulatory element, wherein the presence of said intact regulatory element in said third nucleic acid molecule indicates a direction of cloning of said first nucleic acid molecule with respect to said second nucleic acid molecule.
 29. The method of claim 28, wherein said regulatory element comprises a lacO site.
 30. A method for regulated recombination in host cells that constitutively express a recombinase, comprising: a) providing: i) a host cell expressing a recombinase; ii) a first nucleic acid construct comprising an origin of replication, a first site-specific recombinase site, a second site-specific recombinase site that differs in sequence from said first site-specific recombinase site such that said recombinase will not initiate recombination between said first and second site-specific recombinase sites, and a selectable marker gene between said first and second site-specific recombinase sites; and iii) a second nucleic acid construct comprising an origin of replication, a third site-specific recombinase target site, and a fourth site-specific recombinase target site that differs in sequence from said third site-specific recombinase site such that said recombinase will not initiate recombination between said third and fourth site-specific recombinase sites; and b) introducing said first and second nucleic acid constructs into said host cell under conditions such that said first and second nucleic acid constructs are recombined.
 31. The method of claim 30, further comprising the step of selecting for a desired recombinant nucleic acid molecule using said selectable marker.
 32. The method of claim 30, wherein said first nucleic acid construct is a Univector.
 33. The method of claim 30, wherein said second nucleic acid construct is a Univector.
 34. A host cell expressing a recombinant nucleic acid construct prepared according to the method of claim 30, wherein said host cell constitutively expresses a recombinase.
 35. A method for the recombination of nucleic acid constructs, comprising: a) providing: i) a first nucleic acid construct comprising a loxH site; ii) a second nucleic acid construct comprising a loxH site; and iii) a site-specific recombinase; and b) contacting said first and said second nucleic acid constructs with said site-specific recombinase under conditions such that said first and second nucleic acid constructs are recombined.
 36. A recombined nucleic acid construct prepared according to the method of claim 35.